Changeset 7381:ab8cb00ecf6a in orange
 Timestamp:
 02/04/11 01:32:36 (3 years ago)
 Branch:
 default
 Convert:
 37fbdb3df91d97b168827615fd0126c4b0370300
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

orange/Orange/network/__init__.py
r7364 r7381 10 10 .. autoclass:: Orange.network.Network 11 11 :members: 12 13 Examples 14 ======== 15 16 Reading and saving a network 17  18 19 This example demonstrates reading a network. Network class can read or write 20 Pajek (.net) or GML file format. 21 22 `networkread.py`_ (uses: `K5.net`_): 23 24 .. literalinclude:: code/networkread.py 25 :lines: 45 26 27 Visualize a network in NetExplorer widget 28  29 30 This example demonstrates how to display a network in NetExplorer. 31 32 part of `networkwidget.py`_ 33 34 .. literalinclude:: code/networkwidget.py 35 :lines: 1016 36 37 .. image:: files/networkexplorer.png 38 :width: 100% 12 39 13 40 =========================== … … 18 45 :members: 19 46 :excludemembers: collapse 20 21 ============================= 22 Community Detection in Graphs 23 ============================= 24 25 .. autoclass:: Orange.network.NetworkClustering 26 :members: 27 28 29 ======== 47 30 48 Examples 31 49 ======== 32 50 33 51 Network constructor and random layout 34 ===================================== 52  35 53 36 54 In our first example we create a Network object with a simple full graph (K5). … … 49 67 50 68 Network layout optimization 51 =========================== 69  52 70 53 71 This example demonstrates how to optimize network layout using one of included … … 71 89 72 90 .. image:: files/networkK5fr.png 73 74 Reading and saving a network 75 ============================ 76 77 This example demonstrates reading a network. Network class can read or write 78 Pajek (.net) or GML file format. 79 80 `networkread.py`_ (uses: `K5.net`_): 81 82 .. literalinclude:: code/networkread.py 83 :lines: 45 84 85 Visualize a network in NetExplorer widget 86 ========================================= 87 88 This example demonstrates how to display a network in NetExplorer. 89 90 part of `networkwidget.py`_ 91 92 .. literalinclude:: code/networkwidget.py 93 :lines: 1016 94 95 .. image:: files/networkexplorer.png 96 :width: 100% 91 92 ====== 93 Graphs 94 ====== 95 96 Orange offers a data structure for representing directed and undirected graphs 97 with various types of weighted connections. 98 99 Basic graphs have only one type of edges. Each edge can have an associated 100 number, representing a strength of the edge  with whatever underlying 101 physical interpretation. Orange's graphs are more general and two vertices 102 can be connected by edges of various types. One use for this would be in 103 genetics, where one gene can excite or inhibit another  or both 104 simultaneously, which is why we can't simply assign negative numbers to the 105 edges. The number of edge types is unlimited, but needs to be set in advance. 106 107 Before constructing a graph, you will also need to decide for the underlying 108 data structure. The differences for smaller graphs (e.g. with less than 109 100 nodes) should be insignificant, while for the larger, the decision should 110 be based upon the expected number of edges ("density" of the graph) and 111 the operations you plan to execute. Graphs with large number of edges 112 (eg.>n2/4, where n is the number of vertices) should be represented with 113 adjacency matrices (class GraphAsMatrix), graphs with small number of edges 114 with adjacency lists (GraphAsList) and those in between with adjacency trees 115 (GraphAsTrees). Regarding the speed, matrices are generally the fastest, while 116 some operations, such as finding all edges leading from certain node, will 117 sometimes be faster with lists or trees. 118 119 One thing that is not supported (at the moment?) are multiple edges of the 120 same type between two vertices. 121 122 Construction 123 ============ 124 125 When constructing a graph, you will need to decide about the data structure for 126 representation of edges, and call the corresponding constructor. All 127 constructors take the same arguments: the number of vertices (needs to be given 128 in advance, you cannot add additional vertices later), a flag telling whether 129 the graph is directed or not, and the number of edge types. The default number 130 of edge types is 1 (a normal graph), while the other two arguments are 131 mandatory. 132 133 You can choose between three constructors, all derived from a single ancestor 134 Graph: 135 136 .. class:: GraphAsMatrix(nVertices, directed[, nEdgeTypes]) 137 138 Edges are stored in a matrix with either n2 or n(n+1)/2 elements, depending 139 upon whether the graph is directed or not. (In C++, it is stored as float * 140 pointing to an array of length n*n*nEdgeTypes or (n*(n+1))/2*nEdgeTypes 141 elements, where nEdgeTypes is the number of edge types.) This 142 representation is suitable for smaller graphs and for dense large graphs. 143 For graph with only one edge type, this representation is more economical 144 than representation with lists or trees when the number of edges is larger 145 than n2/4. Inserting, deleting and checking the edges is fast; listing the 146 neighbours of a certain node is fast unless the graph is sparse, in which 147 case a graph represented with a list or a tree would be faster. 148 149 .. class:: GraphAsList(nVertices, directed[, nEdgeTypes]) 150 151 Edges are stored in an ordered lists of neighbours, one list for each node. 152 In C++, for each neighbour, the connection is stored in a structure with 153 the vertex number (int), a pointer to the next structure, and an array of 154 floats, one for each integer. With 16byte alignment, this would take 16 155 bytes for graphs with one or two edge types on the usual 32bit platforms. 156 For undirected graphs, each edge is stored only once, in the list of the 157 edge with the smaller index. This makes the structure smaller and insertion 158 and lookup faster; it slows down finding the neighbours of a given node. 159 This structure is convenient for graphs with a very small number of edges. 160 For them, inserting and removing edges is relatively fast; getting all 161 edges leading from a vertex is fast, while getting edges leading to a 162 vertex or getting all neighbours (in directed or undirected graph) is slow. 163 164 .. class:: GraphAsTree(nVertices, directed[, nEdgeTypes]) 165 166 This structure is similar to GraphAsTree except that the edges are stored 167 in trees instead of lists. This should be a structure of choice for all 168 graph between really sparse and those having one quarter of possible edges. 169 As expected, queries are fast, while insertion and removal of edges is 170 somewhat slower (though still faster than for GraphAsList unless the number 171 of edges is really small). Internally, nodes of the tree contain the vertex 172 number, two pointers and a list of floats. With one edge type, this should 173 be 16 bytes on 32bit platforms. An ordinary undirected graph with 10 174 vertices stored in a matrix would thus be constructed by 175 176 Examples 177  178 179 graph = orange.GraphAsMatrix(10, 0) 180 181 A directed graph with 1000 vertices and edges of three types, stored with 182 adjacency trees would be constructed by 183 184 graph = orange.GraphAsTree(1000, 1, 3) 185 186 Usage 187 ===== 188 189 All three graph types are used the same way, independent of the underlying 190 structure. All methods are defined in basic class :obj:Orange.network.Graph. 191 192 .. class:: Graph 193 194 .. attribute:: nVertices 195 196 The number of vertices (readonly, set at construction) 197 198 .. attribute:: nEdgeTypes 199 200 The number of different edge types (readonly, set at construction) 201 202 .. attribute:: directed 203 204 Tells whether the graph is directed (readonly, set at construction) 205 206 .. attribute:: objects 207 208 A dictionary, list or other sequence of objects that correspond to 209 graph nodes. The use of this object is described in section on 210 indexing. 211 212 .. attribute:: forceMapping 213 214 Determines whether to map integer indices through 'objects'. Details are 215 described below. 216 217 .. attribute:: returnIndices 218 219 If set, the methods that return list of neighbours will return lists of 220 integers even when objects are given. 221 222 .. Basic Operations 223 ..  224 225 .. Indexing 226 ..  227 228 Vertices are referred to by either integer indices or Python objects of any 229 type. In the latter case, a mapping should be provided by assigning the 230 'objects' attribute. For instance, if you set graph.objects to ["Age", 231 "Gender", "Height", "Weight"] then graph["Age", "Height"] would be equivalent 232 to graph[0, 2] and graph.getNeighbours("Weight") to graph.getNeighbours(3). 233 Vertex identifier doesn't need to be a string, , it can be any Python 234 object. 235 236 If objects contains a dictionary, its keys are vertex identifiers and the 237 values in the dictionary should be integers, eg. 238 239 graph.objects = {} 240 graph.objects["Age"] = 0 241 graph.objects[None] = 1 242 graph.objects[orange] = 4 243 244 If not a dictionary, objects can be any kind of sequence. Usually, you will 245 give it a list of the same length as the number of vertices in the graph, 246 so each element would identify a vertex. When indexing, the index is sought 247 for in the list. objects can also be a list of attributes, a domain, an 248 example table or even a single string; Orange will run a code equivalent to 249 "for o in graph.objects", so everything for which such a loop works, goes. 250 251 Searching through the list is, of course, rather slow, so it is recommended 252 to use integer indices for larger graphs. So, if you request 253 graph.getNeighbours(0), the method will return the neighbours of the first 254 vertex, even if objects is given. But  what if you want to map all 255 indices, even integers, through objects? In this case, you need to set 256 graph.forceMapping to 1. If graph.forceMapping is set and graph.objects is 257 given, even getNeighbours(0) will search the graph.objects for 0 and return 258 the neighbours of the corresponding (not necessarily the first) node. 259 260 .. Getting and Setting Edges 261 ..  262 263 .. method:: graph[v1, v2] 264 265 For ordinary graphs with a single edge type, graph[v1, v2] returns the 266 weight of the edge between v1 and v2, or None if there is no edge 267 (edge's weight can also be 0). Edges can also be set by assigning them 268 a weight, e.g.graph[2, 5]=1.5. As described above, if objects is set, 269 we can use other objects, such as names, as v1 and v2 (the same goes 270 for all other functions described below). For graphs with multiple edge 271 types, graph[v1, v2] returns a list of weights for various edge types. 272 Some (or all, if there is no edge) elements of the list can be None. If 273 the edge does not exist, graph[v1, v2] returns a list of Nones, not a 274 None. You can assign a list to graph[v1, v2]; in graph with three edge 275 types you can set graph[2, 5] = [1.5, None, 2.0]. After that, there 276 are two edges between vertices 2 and 5, one of the first type with 277 weight 1.5, and one of the third with weight 2.0. To remove an edge, 278 you can assign it a least of Nones or a single None, e.g. graph[2, 279 5]=None; this removes edges of all types between the two nodes. The 280 list returned for graphs with multiple edge types is actually a 281 reference to the edge, therefore you can set e = graph[v1, v2] and then 282 manipulate e, for instance e[1]=10 or e[0]=None. Edge will behave just 283 as an ordinary list (well, almost  no slicing ets). However, don't try 284 to assign a list to e, eg e=[1, None, 4]. This assigns a list to e, not 285 to the corresponding edge... graph[v1, v2, type] This is defined only 286 for graph with multiple edge types; it returns the weight for the edge 287 of type type between v1 and v2, or None if there is no such edge. You 288 can also establish an edge by assigning a weight (e.g. graph[2, 5, 2] = 289 2.0) or remove it by assigning it a None (graph[2, 5, 2] = None). 290 edgeExists(v1, v2[, type]) Returns true if the edge between v1 and v2 291 exists. For multiple edge type graphs you can also specify the type of 292 the edge you check for. If the third argument is omitted, the method 293 returns true if there is any kind of edge between the two vertices. It 294 is recommended to use this method when you want to check for a node. In 295 single edge type graphs, if graph[v1, v2]: will fail when there is an 296 edge but it has a weight of zero. With multiple edge types, if 297 graph[v1, v2]: will always success since graph[v1, v2] returns a non 298 empty list; if there is no edges, this will be a list of Nones, but 299 Python still treats it as "true". addCluster(list_of_vertices) Creates 300 a cluster  adds edges between all listed vertices. Queries 301 302 Graph provides a set of function that return nodes connected to a 303 certain node. 304 305 .. method:: getNeighbours(v1[, type]) 306 307 Returns all the nodes that are connected to v1. In directed graphs, 308 this includes vertices with edges toward or from v1. In graphs with 309 multiple edge types you can also specify the edge type you are 310 interested in: getNeighbours will the return only the vertices that are 311 connected to v1 by edges of that type. 312 313 .. method:: getEdgesFrom(v1[, type]) 314 315 Return all the vertices which are connected to v1 by the edges leading 316 from v1. In edges with multiple edge types, you can specify the edge 317 type. In undirected graph, this function is equivalent to 318 getNeighbours. 319 320 .. method:: getEdgesTo(v1[, type]) 321 322 Returns all the vertices with edges leading to v1. Again, you can 323 decide for a single edge type to observe, and, again again, in 324 undirected graphs this function is equivalent to getNeighbours. 325 326 If objects is set, functions return a list of objects (names of 327 vertices or whatever objects you stored in objects). Otherwise, a list 328 of integer indices is returned. If you want to force Graph to return 329 integer indices even if objects is set, set graph.returnIndices to 330 True. 331 332 Of the three operations, the expensive one is to look for the vertices 333 with edges pointing to the given edge. There is no problem when graph 334 is represented as a matrix (graphAsMatrix); these are always fast. On 335 directed graph, getEdgeFrom is always fast as well. 336 337 In undirected graphs represented as lists or trees, the edge between 338 vertices with indices v1 and v2 is stored at the list/tree in the 339 smaller of the two indices. Therefore to list all neighbours of v1, 340 edges with v2<v1 are copied form v1's list, while for edges with v2>v1 341 the function needs to look for v1 in each v2's list/tree. Lookup in 342 trees is fast, while in representation with adjacency list, the 343 function is slower for v1 closer to nVertices/2. If v1 is small there 344 is a great number of v2>v1 whose lists are to be checked, but since the 345 lists are ordered, v1 is more to the beginning of these lists (and when 346 a vertex with index higher than v1 is encountered, we know that v1 is 347 not on the list). If v2 is great, there it is more toward the end of 348 the list, but there is smaller number of lists to be checked. 349 Generally, the average number of traversed list elements for 350 getNeighbours/getEdgesFrom/getEdgesTo on undirected graphs with 351 p*nVertices2 edges is p(nVerticesv1)v1. 352 353 To sum up, if you have a large undirected graph and intend to query for 354 neighbours (or, equivalently, edges to or from a node) a lot, don't use 355 GraphAsList. If the graph is small or you won't use these functions, it 356 doesn't matter. 357 358 For directed graphs, getEdgesFrom is trivial. The other two functions 359 are even slower than for undirected graphs, since to find the edges 360 leading from any vertex to a given one, all lists/trees need to be 361 searched. So, if your algorithm will extensively use getEdgesTo or 362 getNeighbours and your graph is large but the number of edges is less 363 than nEdges2/2, you should use GraphAsTree or, to be faster but consume 364 more memory store the graph as GraphAsMatrix. If the number of edges is 365 greater, GraphAsMatrix is more economic anyway. (This calculation is 366 for graph with only one edge type; see the description of graph types 367 for details. 368 369 However, this is all programmed in C++, so whatever you do, the 370 bottleneck will probably still be in your Python code and not in C++. 371 You probably cannot miss by using GraphAsTree disregarding the size of 372 the graph and the operations you perform on it. 373 374 .. Graph Analyses 375 ..  376 377 .. method:: getConnectedComponents() 378 379 Returns list of all connected components sorted descending by component 380 size. 381 382 .. method:: getDegreeDistribution() 383 384 Returns degree distribution as dictionary of type 385 {degree:number_of_vertices}. 386 387 .. method:: getDegrees() 388 389 Returns list of degrees. List size matches number of vertices. Index of 390 given degree matches index of corresponding vertex. 391 392 .. method:: getHubs(n) 393 394 Returns list of n largest hubs. 395 396 .. method:: getShortestPaths(u, v) 397 398 Returns list of vertices in the shortest path between u and v. 399 400 .. method:: getDistance(u, v) 401 402 Returns a distance between vertices u and v. 403 404 .. method:: getDiameter() 405 406 Returns a diameter of the graph. 407 408 ============================= 409 Community Detection in Graphs 410 ============================= 411 412 .. autoclass:: Orange.network.NetworkClustering 413 :members: 97 414 98 415 .. _networkconstructor.py: code/networkconstructor.py … … 130 447 .. attribute:: coors 131 448 132 Coordinates for all vertices. They are initialized to random positions. 133 You can modify them manually or use one of the optimization algorithms. 449 Coordinates for all vertices. They are initialized to random positions. 450 You can modify them manually or use one of the optimization algorithms. 134 451 Usage: coors[0][i], coors[1][i]; 0 for xaxis, 1 for yaxis 135 452
Note: See TracChangeset
for help on using the changeset viewer.