low efficiency of "from_networkx" compared to TupleList and the other way

I ran into the slow performance of "from_networkx" while trying to run the Leiden algorithm on a very large network of around 33,000 nodes and 11.3 million edges. I reported the problem on the leidenalg GitHub, and the author, Traag, suggested that I report the issue here. Here are my experiments with three ways to load the network:

"from_networkx":
```python
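import igraph as ig
import leidenalg as la

G1 = ig.Graph.from_networkx(G)  # G is an existing networkx graph
prepartition = la.find_partition(
    G1, la.RBConfigurationVertexPartition, initial_membership=None, weights=G1.es["weight"],
    n_iterations=20, max_comm_size=0, seed=1, resolution_parameter=inputResolution)
# Map each original networkx node ID to its community index.
partition_dict = {int(name): int(membership)
                  for name, membership in zip(G1.vs["_nx_name"], prepartition.membership)}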
```

"TupleList":
```python
# edges is a list of (source, target, weight, estTime) tuples.
G1 = ig.Graph.TupleList(edges, directed=False, vertex_name_attr="name", edge_attrs=["weight", "estTime"], weights=False)
prepartition = la.find_partition(
    G1, la.RBConfigurationVertexPartition, initial_membership=None, weights=G1.es["weight"],
    n_iterations=20, max_comm_size=0, seed=1, resolution_parameter=inputResolution)
partition_dict = {int(name): int(membership)
                  for name, membership in zip(G1.vs["name"], prepartition.membership)}
```

"the third method":
```python
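# Assumes node IDs are the consecutive integers 0..n-1, because add_edges()
# interprets integers as vertex indices rather than as vertex names.
nodes = sorted(set(G.nodes))
G1 = ig.Graph(directed=False)
G1.add_vertices(len(nodes))
G1.vs["name"] = nodes
# edges is a list of (source, target, weight, estTime) tuples.
G1.add_edges([(x, y) for (x, y, z, w) in edges])
G1.es["weight"] = [z for (x, y, z, w) in edges]
G1.es["estTime"] = [w for (x, y, z, w) in edges]
prepartition = la.find_partition(
    G1, la.RBConfigurationVertexPartition, initial_membership=None, weights=G1.es["weight"],
    n_iterations=20, max_comm_size=0, seed=1, resolution_parameter=inputResolution)
partition_dict = {int(name): int(membership)
                  for name, membership in zip(G1.vs["name"], prepartition.membership)}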
```

Loading with "from_networkx", "TupleList", and "creating a graph by adding vertices and edges" took around 14.7 minutes, 0.82 seconds, and 0.14 seconds, respectively. Such a large difference may stem from "networkx".
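To check that the gap comes from graph construction and not from `find_partition`, each constructor can be timed on its own. The following is a minimal sketch using only the standard library; `time_call` is a hypothetical helper name, not part of igraph or leidenalg, and the commented usage lines assume `G` and `edges` exist as in the snippets above.

```python
import time


def time_call(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs) once and report the wall-clock time.

    Hypothetical helper for illustration; returns (result, elapsed_seconds).
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return result, elapsed


# Intended usage (requires igraph, plus G and edges from the snippets above):
# G1, t_nx = time_call("from_networkx", ig.Graph.from_networkx, G)
# G2, t_tl = time_call("TupleList", ig.Graph.TupleList, edges,
#                      directed=False, edge_attrs=["weight", "estTime"])
```

A single run per method is enough to see a gap of this size; for closer comparisons, repeating each call a few times and taking the minimum would be more robust.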

I also found that "from_networkx" loads all of my nodes whether or not they have edges, whereas "TupleList" loads only the nodes that appear in at least one edge. The third method loads all nodes, in ascending order.
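One possible workaround that keeps isolated nodes while avoiding "from_networkx" is to map node IDs to indices first and pass index pairs to the `Graph` constructor. This is a sketch under assumptions, not the library's own conversion: `index_edges` and `build_igraph` are hypothetical names, and `edges` is assumed to hold `(source, target, weight, estTime)` tuples as in the snippets above.

```python
def index_edges(nodes, edges):
    """Map arbitrary node IDs to 0..n-1 and split edge tuples into columns.

    nodes: iterable of hashable node IDs, isolated nodes included.
    edges: iterable of (source, target, weight, estTime) tuples.
    """
    names = list(nodes)
    index_of = {name: i for i, name in enumerate(names)}
    pairs, weights, est_times = [], [], []
    for source, target, weight, est_time in edges:
        pairs.append((index_of[source], index_of[target]))
        weights.append(weight)
        est_times.append(est_time)
    return names, pairs, weights, est_times


def build_igraph(nodes, edges):
    """Build an undirected igraph Graph that keeps every node in `nodes`."""
    import igraph as ig  # imported lazily so index_edges() works without igraph

    names, pairs, weights, est_times = index_edges(nodes, edges)
    graph = ig.Graph(n=len(names), edges=pairs, directed=False)
    graph.vs["name"] = names
    graph.es["weight"] = weights
    graph.es["estTime"] = est_times
    return graph
```

With the variables from this report, the call would be `G1 = build_igraph(G.nodes, edges)`. Unlike the third method, this does not require the node IDs to be consecutive integers, because edges are translated to indices explicitly.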

BTW, I used `python-igraph` 0.8.3, which I obtained from your site.


