pyspark.pandas.DataFrame.spark.to_spark_io
spark.to_spark_io(path: Optional[str] = None, format: Optional[str] = None, mode: str = 'overwrite', partition_cols: Union[str, List[str], None] = None, index_col: Union[str, List[str], None] = None, **options: OptionalPrimitiveType) → None

Write the DataFrame out to a Spark data source.
DataFrame.spark.to_spark_io() is an alias of DataFrame.to_spark_io().

- Parameters
- path : string, optional
Path to the data source.
- format : string, optional
Specifies the output data source format. Some common ones are:
'delta'
'parquet'
'orc'
'json'
'csv'
- mode : str {'append', 'overwrite', 'ignore', 'error', 'errorifexists'}, default 'overwrite'. Specifies the behavior of the save operation when data already exists (a short sketch of these modes follows this parameter list).
'append': Append the new data to existing data.
'overwrite': Overwrite existing data.
'ignore': Silently ignore this operation if data already exists.
'error' or 'errorifexists': Throw an exception if data already exists.
- partition_cols : str or list of str, optional
Names of partitioning columns.
- index_col : str or list of str, optional, default: None
Column names to be used in Spark to represent pandas-on-Spark's index. The index name in pandas-on-Spark is ignored. By default, the index is always lost (see the index_col example under Examples below).
- options : dict
All other options passed directly into Spark's data source.
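
To make the mode behaviors concrete, here is a minimal sketch; the path '/tmp/example_parquet' and the small frame are hypothetical, chosen only for illustration:

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'id': [1, 2, 3]})
>>> # First write creates the data at the hypothetical path.
>>> df.spark.to_spark_io(path='/tmp/example_parquet', format='parquet',
...                      mode='overwrite')
>>> # 'append' writes the same three rows again, doubling the row count.
>>> df.spark.to_spark_io(path='/tmp/example_parquet', format='parquet',
...                      mode='append')
>>> # 'ignore' does nothing here, because data already exists at the path.
>>> df.spark.to_spark_io(path='/tmp/example_parquet', format='parquet',
...                      mode='ignore')
>>> # mode='error' or 'errorifexists' would raise an exception instead.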
- Returns
- None
See also
read_spark_io
DataFrame.to_delta
DataFrame.to_parquet
DataFrame.to_table
DataFrame.to_spark_io
DataFrame.spark.to_spark_io
Examples
>>> df = ps.DataFrame(dict(
...    date=list(pd.date_range('2012-1-1 12:00:00', periods=3, freq='M')),
...    country=['KR', 'US', 'JP'],
...    code=[1, 2, 3]), columns=['date', 'country', 'code'])
>>> df
                 date country  code
0 2012-01-31 12:00:00      KR     1
1 2012-02-29 12:00:00      US     2
2 2012-03-31 12:00:00      JP     3
>>> df.to_spark_io(path='%s/to_spark_io/foo.json' % path, format='json')
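
By default the write drops the pandas-on-Spark index, as noted under index_col. A hedged sketch, reusing the path variable from the example above, that writes the index out as a column named 'index' and restores it on read:

>>> df.spark.to_spark_io(path='%s/to_spark_io/bar.json' % path,
...                      format='json', index_col='index')
>>> restored = ps.read_spark_io(path='%s/to_spark_io/bar.json' % path,
...                             format='json', index_col='index')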
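
Extra keyword arguments are forwarded to the underlying Spark data source as options. A minimal sketch under the same path assumption, passing Spark's csv header option so column names are written in the first row:

>>> df.spark.to_spark_io(path='%s/to_spark_io/baz.csv' % path,
...                      format='csv', header=True)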