On relatively large graphs (n > ~10k) with colocated points, the clique expansion function can get really slow. After cutting it short a few times, I think this for-loop is where things are getting stuck. If that's the bottleneck, it should be pretty straightforward to parallelize with something like joblib.
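A minimal sketch of the joblib idea, assuming the loop body is an independent per-clique expansion (the function name and inputs here are hypothetical, not the library's actual API):

```python
from joblib import Parallel, delayed


def expand_one_clique(members):
    # Hypothetical stand-in for the loop body: build all directed
    # pairs among the colocated members of one clique.
    return [(a, b) for a in members for b in members if a != b]


def expand_cliques_parallel(cliques, n_jobs=-1):
    # Each clique's expansion is independent, so each can run in a
    # separate joblib worker; n_jobs=-1 uses all available cores.
    results = Parallel(n_jobs=n_jobs)(
        delayed(expand_one_clique)(members) for members in cliques
    )
    # Flatten the per-clique edge lists into one edge list.
    return [edge for edges in results for edge in edges]


edges = expand_cliques_parallel([[0, 1, 2], [3, 4]])
```

Whether this pays off depends on clique sizes: many tiny cliques may be dominated by worker overhead, in which case batching cliques per worker would help.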
We moved away from it because (a) hashing the points to establish unique locations was expensive and (b) our policy on sorting was different, but I think we can still use a pandas `join` and `explode` given the new function signature.
The earlier implementation, conceptually:
1. Built a series indexed on unique locations, with values being lists of all the original indices at each location (a `groupby` on the geometries with `agg(list)` on the ids).
2. Exploded the aggregated series.
3. Joined the exploded series back to the adjacency table, letting the join handle the one-to-many expansion for edges entering/leaving cliques.
4. Separately, built the all-pairs links within each clique using a self-join.
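The steps above can be sketched in pandas roughly as follows. The column names, the location encoding, and the toy adjacency table are all illustrative assumptions, not the actual data model:

```python
import pandas as pd

# Toy input: point ids with (possibly colocated) locations.
points = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "loc": ["POINT (0 0)", "POINT (0 0)", "POINT (1 1)", "POINT (1 1)"],
})

# Step 1: series indexed on unique locations, values = list of ids there.
clique_members = points.groupby("loc")["id"].agg(list)

# Step 2: explode the aggregated series, one row per (location, member id).
exploded = clique_members.explode().rename("member").reset_index()

# Step 3: join back to a location-keyed adjacency table; the merge
# performs the one-to-many expansion for edges entering/leaving cliques.
adjacency = pd.DataFrame({"src_loc": ["POINT (0 0)"],
                          "dst_loc": ["POINT (1 1)"]})
edges = (
    adjacency
    .merge(exploded.rename(columns={"loc": "src_loc", "member": "src"}),
           on="src_loc")
    .merge(exploded.rename(columns={"loc": "dst_loc", "member": "dst"}),
           on="dst_loc")
)

# Step 4: all-pairs links within each clique via a self-join on location.
within = exploded.merge(exploded, on="loc", suffixes=("_a", "_b"))
within = within[within["member_a"] != within["member_b"]]
```

This keeps the expansion inside vectorized merges, which is the property that made the earlier approach fast despite the hashing cost mentioned above.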