Changeset 7381:ab8cb00ecf6a in orange


Ignore:
Timestamp:
02/04/11 01:32:36 (3 years ago)
Author:
miha <miha.stajdohar@…>
Branch:
default
Convert:
37fbdb3df91d97b168827615fd0126c4b0370300
Message:
 
File:
1 edited

Legend:

Unmodified
Added
Removed
  • orange/Orange/network/__init__.py

    r7364 r7381  
    1010.. autoclass:: Orange.network.Network 
    1111   :members: 
     12    
     13Examples 
     14======== 
     15 
     16Reading and saving a network 
     17---------------------------- 
     18 
     19This example demonstrates reading a network. Network class can read or write  
     20Pajek (.net) or GML file format. 
     21 
     22`network-read.py`_ (uses: `K5.net`_): 
     23 
     24.. literalinclude:: code/network-read.py 
     25    :lines: 4-5 
     26 
     27Visualize a network in NetExplorer widget 
     28----------------------------------------- 
     29 
     30This example demonstrates how to display a network in NetExplorer. 
     31 
     32part of `network-widget.py`_ 
     33 
     34.. literalinclude:: code/network-widget.py 
     35    :lines: 10-16 
     36     
     37.. image:: files/network-explorer.png 
     38    :width: 100% 
    1239    
    1340=========================== 
     
    1845       :members: 
    1946       :exclude-members: collapse  
    20     
    21 ============================= 
    22 Community Detection in Graphs 
    23 ============================= 
    24  
    25 .. autoclass:: Orange.network.NetworkClustering 
    26    :members: 
    27     
    28     
    29 ======== 
     47        
    3048Examples 
    3149======== 
    3250 
    3351Network constructor and random layout 
    34 ===================================== 
     52------------------------------------- 
    3553 
    3654In our first example we create a Network object with a simple full graph (K5).  
     
    4967 
    5068Network layout optimization 
    51 =========================== 
     69--------------------------- 
    5270 
    5371This example demonstrates how to optimize network layout using one of included  
     
    7189 
    7290.. image:: files/network-K5-fr.png 
    73  
    74 Reading and saving a network 
    75 ============================ 
    76  
    77 This example demonstrates reading a network. Network class can read or write  
    78 Pajek (.net) or GML file format. 
    79  
    80 `network-read.py`_ (uses: `K5.net`_): 
    81  
    82 .. literalinclude:: code/network-read.py 
    83     :lines: 4-5 
    84  
    85 Visualize a network in NetExplorer widget 
    86 ========================================= 
    87  
    88 This example demonstrates how to display a network in NetExplorer. 
    89  
    90 part of `network-widget.py`_ 
    91  
    92 .. literalinclude:: code/network-widget.py 
    93     :lines: 10-16 
    94      
    95 .. image:: files/network-explorer.png 
    96     :width: 100% 
     91    
     92====== 
     93Graphs 
     94====== 
     95 
     96Orange offers a data structure for representing directed and undirected graphs  
     97with various types of weighted connections. 
     98 
     99Basic graphs have only one type of edges. Each edge can have an associated  
     100number, representing a strength of the edge - with whatever underlying  
     101physical interpretation. Orange's graphs are more general and two vertices  
     102can be connected by edges of various types. One use for this would be in  
     103genetics, where one gene can excite or inhibit another - or both  
     104simultaneously, which is why we can't simply assign negative numbers to the  
     105edges. The number of edge types is unlimited, but needs to be set in advance. 
     106 
     107Before constructing a graph, you will also need to decide for the underlying  
     108data structure. The differences for smaller graphs (e.g. with less than  
     109100 nodes) should be insignificant, while for the larger, the decision should  
     110be based upon the expected number of edges ("density" of the graph) and  
     111the operations you plan to execute. Graphs with large number of edges  
     112(eg.>n2/4, where n is the number of vertices) should be represented with  
     113adjacency matrices (class GraphAsMatrix), graphs with small number of edges  
     114with adjacency lists (GraphAsList) and those in between with adjacency trees  
     115(GraphAsTrees). Regarding the speed, matrices are generally the fastest, while  
     116some operations, such as finding all edges leading from certain node, will  
     117sometimes be faster with lists or trees. 
     118 
     119One thing that is not supported (at the moment?) are multiple edges of the  
     120same type between two vertices. 
     121 
     122Construction 
     123============ 
     124 
     125When constructing a graph, you will need to decide about the data structure for 
     126representation of edges, and call the corresponding constructor. All 
     127constructors take the same arguments: the number of vertices (needs to be given 
     128in advance, you cannot add additional vertices later), a flag telling whether 
     129the graph is directed or not, and the number of edge types. The default number 
     130of edge types is 1 (a normal graph), while the other two arguments are 
     131mandatory. 
     132 
     133You can choose between three constructors, all derived from a single ancestor 
     134Graph: 
     135 
     136.. class:: GraphAsMatrix(nVertices, directed[, nEdgeTypes]) 
     137 
     138    Edges are stored in a matrix with either n2 or n(n+1)/2 elements, depending 
     139    upon whether the graph is directed or not. (In C++, it is stored as float * 
     140    pointing to an array of length n*n*nEdgeTypes or (n*(n+1))/2*nEdgeTypes 
     141    elements, where nEdgeTypes is the number of edge types.) This 
     142    representation is suitable for smaller graphs and for dense large graphs. 
     143    For graph with only one edge type, this representation is more economical 
     144    than representation with lists or trees when the number of edges is larger 
     145    than n2/4. Inserting, deleting and checking the edges is fast; listing the 
     146    neighbours of a certain node is fast unless the graph is sparse, in which 
     147    case a graph represented with a list or a tree would be faster. 
     148     
     149.. class:: GraphAsList(nVertices, directed[, nEdgeTypes])  
     150     
     151    Edges are stored in an ordered lists of neighbours, one list for each node. 
     152    In C++, for each neighbour, the connection is stored in a structure with 
     153    the vertex number (int), a pointer to the next structure, and an array of 
     154    floats, one for each integer. With 16-byte alignment, this would take 16 
     155    bytes for graphs with one or two edge types on the usual 32-bit platforms. 
     156    For undirected graphs, each edge is stored only once, in the list of the 
     157    edge with the smaller index. This makes the structure smaller and insertion 
     158    and lookup faster; it slows down finding the neighbours of a given node. 
     159    This structure is convenient for graphs with a very small number of edges. 
     160    For them, inserting and removing edges is relatively fast; getting all 
     161    edges leading from a vertex is fast, while getting edges leading to a 
     162    vertex or getting all neighbours (in directed or undirected graph) is slow. 
     163     
     164.. class:: GraphAsTree(nVertices, directed[, nEdgeTypes])  
     165     
     166    This structure is similar to GraphAsTree except that the edges are stored 
     167    in trees instead of lists. This should be a structure of choice for all 
     168    graph between really sparse and those having one quarter of possible edges. 
     169    As expected, queries are fast, while insertion and removal of edges is 
     170    somewhat slower (though still faster than for GraphAsList unless the number 
     171    of edges is really small). Internally, nodes of the tree contain the vertex 
     172    number, two pointers and a list of floats. With one edge type, this should 
     173    be 16 bytes on 32-bit platforms. An ordinary undirected graph with 10 
     174    vertices stored in a matrix would thus be constructed by 
     175 
     176Examples 
     177-------- 
     178 
     179graph = orange.GraphAsMatrix(10, 0) 
     180 
     181A directed graph with 1000 vertices and edges of three types, stored with 
     182adjacency trees would be constructed by 
     183 
     184graph = orange.GraphAsTree(1000, 1, 3) 
     185 
     186Usage 
     187===== 
     188 
     189All three graph types are used the same way, independent of the underlying 
     190structure. All methods are defined in basic class :obj:Orange.network.Graph. 
     191 
     192.. class:: Graph 
     193 
     194    .. attribute:: nVertices 
     195     
     196        The number of vertices (read-only, set at construction) 
     197     
     198    .. attribute:: nEdgeTypes 
     199     
     200        The number of different edge types (read-only, set at construction) 
     201     
     202    .. attribute:: directed 
     203     
     204        Tells whether the graph is directed (read-only, set at construction) 
     205     
     206    .. attribute:: objects 
     207     
     208        A dictionary, list or other sequence of objects that correspond to 
     209        graph nodes. The use of this object is described in section on 
     210        indexing. 
     211 
     212    .. attribute:: forceMapping 
     213 
     214        Determines whether to map integer indices through 'objects'. Details are 
     215        described below. 
     216 
     217    .. attribute:: returnIndices 
     218 
     219        If set, the methods that return list of neighbours will return lists of 
     220        integers even when objects are given. 
     221 
     222    .. Basic Operations 
     223    .. ---------------- 
     224 
     225    .. Indexing 
     226    .. -------- 
     227     
     228    Vertices are referred to by either integer indices or Python objects of any 
     229    type. In the latter case, a mapping should be provided by assigning the 
     230    'objects' attribute. For instance, if you set graph.objects to ["Age", 
     231    "Gender", "Height", "Weight"] then graph["Age", "Height"] would be equivalent 
     232    to graph[0, 2] and graph.getNeighbours("Weight") to graph.getNeighbours(3). 
     233    Vertex identifier doesn't need to be a string, , it can be any Python 
     234    object. 
     235     
     236    If objects contains a dictionary, its keys are vertex identifiers and the 
     237    values in the dictionary should be integers, eg. 
     238     
     239    graph.objects = {} 
     240    graph.objects["Age"] = 0 
     241    graph.objects[None] = 1 
     242    graph.objects[orange] = 4 
     243 
     244    If not a dictionary, objects can be any kind of sequence. Usually, you will 
     245    give it a list of the same length as the number of vertices in the graph, 
     246    so each element would identify a vertex. When indexing, the index is sought 
     247    for in the list. objects can also be a list of attributes, a domain, an 
     248    example table or even a single string; Orange will run a code equivalent to 
     249    "for o in graph.objects", so everything for which such a loop works, goes. 
     250     
     251    Searching through the list is, of course, rather slow, so it is recommended 
     252    to use integer indices for larger graphs. So, if you request 
     253    graph.getNeighbours(0), the method will return the neighbours of the first 
     254    vertex, even if objects is given. But - what if you want to map all 
     255    indices, even integers, through objects? In this case, you need to set 
     256    graph.forceMapping to 1. If graph.forceMapping is set and graph.objects is 
     257    given, even getNeighbours(0) will search the graph.objects for 0 and return 
     258    the neighbours of the corresponding (not necessarily the first) node. 
     259 
     260    .. Getting and Setting Edges 
     261    .. ------------------------- 
     262     
     263    .. method:: graph[v1, v2] 
     264     
     265        For ordinary graphs with a single edge type, graph[v1, v2] returns the 
     266        weight of the edge between v1 and v2, or None if there is no edge 
     267        (edge's weight can also be 0). Edges can also be set by assigning them 
     268        a weight, e.g.graph[2, 5]=1.5. As described above, if objects is set, 
     269        we can use other objects, such as names, as v1 and v2 (the same goes 
     270        for all other functions described below). For graphs with multiple edge 
     271        types, graph[v1, v2] returns a list of weights for various edge types. 
     272        Some (or all, if there is no edge) elements of the list can be None. If 
     273        the edge does not exist, graph[v1, v2] returns a list of Nones, not a 
     274        None. You can assign a list to graph[v1, v2]; in graph with three edge 
     275        types you can set graph[2, 5] = [1.5, None, -2.0]. After that, there 
     276        are two edges between vertices 2 and 5, one of the first type with 
     277        weight 1.5, and one of the third with weight -2.0. To remove an edge, 
     278        you can assign it a least of Nones or a single None, e.g. graph[2, 
     279        5]=None; this removes edges of all types between the two nodes. The 
     280        list returned for graphs with multiple edge types is actually a 
     281        reference to the edge, therefore you can set e = graph[v1, v2] and then 
     282        manipulate e, for instance e[1]=10 or e[0]=None. Edge will behave just 
     283        as an ordinary list (well, almost - no slicing ets). However, don't try 
     284        to assign a list to e, eg e=[1, None, 4]. This assigns a list to e, not 
     285        to the corresponding edge... graph[v1, v2, type] This is defined only 
     286        for graph with multiple edge types; it returns the weight for the edge 
     287        of type type between v1 and v2, or None if there is no such edge. You 
     288        can also establish an edge by assigning a weight (e.g. graph[2, 5, 2] = 
     289        -2.0) or remove it by assigning it a None (graph[2, 5, 2] = None). 
     290        edgeExists(v1, v2[, type]) Returns true if the edge between v1 and v2 
     291        exists. For multiple edge type graphs you can also specify the type of 
     292        the edge you check for. If the third argument is omitted, the method 
     293        returns true if there is any kind of edge between the two vertices. It 
     294        is recommended to use this method when you want to check for a node. In 
     295        single edge type graphs, if graph[v1, v2]: will fail when there is an 
     296        edge but it has a weight of zero. With multiple edge types, if 
     297        graph[v1, v2]: will always success since graph[v1, v2] returns a non- 
     298        empty list; if there is no edges, this will be a list of Nones, but 
     299        Python still treats it as "true". addCluster(list_of_vertices) Creates 
     300        a cluster - adds edges between all listed vertices. Queries 
     301         
     302        Graph provides a set of function that return nodes connected to a 
     303        certain node. 
     304     
     305    .. method:: getNeighbours(v1[, type]) 
     306     
     307        Returns all the nodes that are connected to v1. In directed graphs, 
     308        this includes vertices with edges toward or from v1. In graphs with 
     309        multiple edge types you can also specify the edge type you are 
     310        interested in: getNeighbours will the return only the vertices that are 
     311        connected to v1 by edges of that type. 
     312     
     313    .. method:: getEdgesFrom(v1[, type]) 
     314     
     315        Return all the vertices which are connected to v1 by the edges leading 
     316        from v1. In edges with multiple edge types, you can specify the edge 
     317        type. In undirected graph, this function is equivalent to 
     318        getNeighbours. 
     319     
     320    .. method:: getEdgesTo(v1[, type]) 
     321     
     322        Returns all the vertices with edges leading to v1. Again, you can 
     323        decide for a single edge type to observe, and, again again, in 
     324        undirected graphs this function is equivalent to getNeighbours. 
     325         
     326        If objects is set, functions return a list of objects (names of 
     327        vertices or whatever objects you stored in objects). Otherwise, a list 
     328        of integer indices is returned. If you want to force Graph to return 
     329        integer indices even if objects is set, set graph.returnIndices to 
     330        True. 
     331         
     332        Of the three operations, the expensive one is to look for the vertices 
     333        with edges pointing to the given edge. There is no problem when graph 
     334        is represented as a matrix (graphAsMatrix); these are always fast. On 
     335        directed graph, getEdgeFrom is always fast as well. 
     336         
     337        In undirected graphs represented as lists or trees, the edge between 
     338        vertices with indices v1 and v2 is stored at the list/tree in the 
     339        smaller of the two indices. Therefore to list all neighbours of v1, 
     340        edges with v2<v1 are copied form v1's list, while for edges with v2>v1 
     341        the function needs to look for v1 in each v2's list/tree. Lookup in 
     342        trees is fast, while in representation with adjacency list, the 
     343        function is slower for v1 closer to nVertices/2. If v1 is small there 
     344        is a great number of v2>v1 whose lists are to be checked, but since the 
     345        lists are ordered, v1 is more to the beginning of these lists (and when 
     346        a vertex with index higher than v1 is encountered, we know that v1 is 
     347        not on the list). If v2 is great, there it is more toward the end of 
     348        the list, but there is smaller number of lists to be checked. 
     349        Generally, the average number of traversed list elements for 
     350        getNeighbours/getEdgesFrom/getEdgesTo on undirected graphs with 
     351        p*nVertices2 edges is p(nVertices-v1)v1. 
     352         
     353        To sum up, if you have a large undirected graph and intend to query for  
     354        neighbours (or, equivalently, edges to or from a node) a lot, don't use  
     355        GraphAsList. If the graph is small or you won't use these functions, it  
     356        doesn't matter. 
     357         
     358        For directed graphs, getEdgesFrom is trivial. The other two functions 
     359        are even slower than for undirected graphs, since to find the edges 
     360        leading from any vertex to a given one, all lists/trees need to be 
     361        searched. So, if your algorithm will extensively use getEdgesTo or 
     362        getNeighbours and your graph is large but the number of edges is less 
     363        than nEdges2/2, you should use GraphAsTree or, to be faster but consume 
     364        more memory store the graph as GraphAsMatrix. If the number of edges is 
     365        greater, GraphAsMatrix is more economic anyway. (This calculation is 
     366        for graph with only one edge type; see the description of graph types 
     367        for details. 
     368         
     369        However, this is all programmed in C++, so whatever you do, the 
     370        bottleneck will probably still be in your Python code and not in C++. 
     371        You probably cannot miss by using GraphAsTree disregarding the size of 
     372        the graph and the operations you perform on it. 
     373     
     374    .. Graph Analyses 
     375    .. -------------- 
     376     
     377    .. method:: getConnectedComponents() 
     378     
     379        Returns list of all connected components sorted descending by component 
     380        size. 
     381     
     382    .. method:: getDegreeDistribution() 
     383     
     384        Returns degree distribution as dictionary of type 
     385        {degree:number_of_vertices}. 
     386     
     387    .. method:: getDegrees() 
     388     
     389        Returns list of degrees. List size matches number of vertices. Index of 
     390        given degree matches index of corresponding vertex. 
     391     
     392    .. method:: getHubs(n) 
     393     
     394        Returns list of n largest hubs. 
     395     
     396    .. method:: getShortestPaths(u, v) 
     397     
     398        Returns list of vertices in the shortest path between u and v. 
     399     
     400    .. method:: getDistance(u, v) 
     401     
     402        Returns a distance between vertices u and v. 
     403     
     404    .. method:: getDiameter() 
     405     
     406        Returns a diameter of the graph. 
     407 
     408============================= 
     409Community Detection in Graphs 
     410============================= 
     411 
     412.. autoclass:: Orange.network.NetworkClustering 
     413   :members: 
    97414 
    98415.. _network-constructor.py: code/network-constructor.py 
     
    130447    .. attribute:: coors 
    131448    
    132         Coordinates for all vertices. They are initialized to random positions.  
    133         You can modify them manually or use one of the optimization algorithms.  
     449        Coordinates for all vertices. They are initialized to random positions. 
     450        You can modify them manually or use one of the optimization algorithms. 
    134451        Usage: coors[0][i], coors[1][i]; 0 for x-axis, 1 for y-axis 
    135452     
Note: See TracChangeset for help on using the changeset viewer.