pyspark.sql.datasource.DataSource
- class pyspark.sql.datasource.DataSource(options)
A base class for data sources.
This class represents a custom data source that allows for reading from and/or writing to it. The data source provides methods to create readers and writers for reading and writing data, respectively. At least one of the methods DataSource.reader() or DataSource.writer() must be implemented by any subclass to make the data source either readable or writable (or both).

After implementing this interface, you can start to load your data source using spark.read.format(...).load() and save data using df.write.format(...).save().

Methods
- name(): Returns a string representing the format name of this data source.
- reader(schema): Returns a DataSourceReader instance for reading data.
- schema(): Returns the schema of the data source.
- simpleStreamReader(schema): Returns a SimpleDataSourceStreamReader instance for reading data.
- streamReader(schema): Returns a DataSourceStreamReader instance for reading streaming data.
- streamWriter(schema, overwrite): Returns a DataSourceStreamWriter instance for writing data into a streaming sink.
- writer(schema, overwrite): Returns a DataSourceWriter instance for writing data.