pyspark.RDD.collectAsMap

RDD.collectAsMap() → Dict[K, V]

Return the key-value pairs in this RDD to the driver as a dictionary.

New in version 0.7.0.

Returns
dict

a dictionary of (key, value) pairs

Notes

This method should only be used if the resulting data is expected to be small, as all the data is loaded into the driver's memory.

Examples

>>> m = sc.parallelize([(1, 2), (3, 4)]).collectAsMap()
>>> m[1]
2
>>> m[3]
4
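
Because the result is a plain Python dictionary, each key can appear only once. The sketch below illustrates what that implies when the pairs contain a repeated key. It uses an ordinary list (the hypothetical name `pairs`) as a stand-in for the collected RDD contents and assumes the dictionary is built as `dict()` would build it from those pairs; which value survives in a real job may depend on the order in which the pairs are collected.

```python
# Pure-Python stand-in for the (key, value) pairs an RDD might hold.
# No SparkContext is needed to run this sketch.
pairs = [(1, 2), (3, 4), (1, 5)]  # key 1 appears twice

# Assumption: the returned dictionary is equivalent to dict() over the
# collected pairs, so a later pair with the same key replaces an earlier one.
m = dict(pairs)

print(m[1])    # 5
print(len(m))  # 2
```

If every value for a repeated key is needed, grouping or reducing by key before collecting is likely the safer choice.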