Input/Output tools (dccd.tools.io
)¶
Tools and object to load, append and save differnet kind of database.
-
class
dccd.tools.io.
IODataBase
(path='./', method='csv')¶ Object to save a pd.DataFrame into different kind/format of database.
Parameters: - path : str, optional
Path of the database, default is ‘./’ (current directory).
- method : {‘DataFrame’, ‘SQLite’, ‘CSV’, ‘Excel’, ‘PostgreSQL’, ‘Oracle’, ‘MSSQL’, ‘MySQL’}
Format of database, default is CSV.
Attributes: - path : str
Path of the database.
- method : {‘DataFrame’, ‘SQLite’, ‘CSV’, ‘Excel’, ‘PostgreSQL’, ‘Oracle’, ‘MSSQL’, ‘MySQL’}
Kind/format of the database.
- parser : dict
Values are function to corresponding to method.
Methods
save_as_dataframe
(self, new_data[, name, ext])Append and save new_data as pd.DataFrame binary object. save_as_sql
(self, new_data[, table, name, …])Append and save new_data in SQL database. save_as_sqlite
(self, new_data[, table, …])Append and save new_data in SQLite database. save_as_csv
(self, new_data[, name, ext, …])Append and save new_data in database as CSV format. save_as_excel
(self, new_data[, name, …])Append and save new_data in database as Excel format. __call__
(self, new_data, \*\*kwargs)Append and save new_data in database as method format. -
get_from_dataframe
(self, name, ext='.dat')¶ Get data from pd.DataFrame binary object.
With pickle get as binary pd.DataFrame object.
Parameters: - name : str
Name of the database.
- ext : str, optional
Extension of the database, default is ‘.dat’.
-
get_from_sqlite
(self, name, table='main_table', ext='.db')¶ Get data from SQLite database.
Parameters: - name : str, optional
Name of the database, default is the current year.
- table : str, optional
Name of the table, default is ‘main_table’.
- ext : str, optional
Extension of the database, default is ‘.db’.
- index : bool, optional
Write pd.DataFrame index as a column. Uses index_label as the column name in the table. Default is True.
- index_label : string or sequence, optional
Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the pd.DataFrame uses pd.MultiIndex.
-
save_as_csv
(self, new_data, name=None, ext='.csv', index=True, index_label=None)¶ Append and save new_data in database as CSV format.
With pickle save as binary pd.DataFrame object, if name database already exists append new_data, otherwise create a new file.
Parameters: - new_data : pd.DataFrame
Data to append to the database.
- name : str, optional
Name of the database, default is the current year.
- ext : str, optional
Extension of the database, default is ‘.csv’.
- index : bool, optional
Write row names (index), default is True.
- index_label : str or sequence, optional
Column label for index column(s) if desired. If not specified (default is None), and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex. If False do not print fields for index names. Use index_label=False for easier importing in R.
-
save_as_dataframe
(self, new_data, name=None, ext='.dat')¶ Append and save new_data as pd.DataFrame binary object.
With pickle save as binary pd.DataFrame object, if name database already exists, load it, append new_data and save it, else create a new database.
Parameters: - new_data : pd.DataFrame
Data to append to the database.
- name : str, optional
Name of the database, default is the current date.
- ext : str, optional
Extension of the database, default is ‘.dat’.
-
save_as_excel
(self, new_data, name=None, sheet_name='Sheet1', ext='.xlsx', index=True, index_label=None)¶ Append and save new_data in database as Excel format.
With pickle save as binary pd.DataFrame object, if name database already exists append new_data, else create a new file.
Parameters: - new_data : pd.DataFrame
Data to append to the database.
- name : str, optional
Name of the database, default is the current date.
- sheet_name : str, optional
Name of sheet which will contain new_data, default is ‘Sheet1’.
- ext : str, optional
Extension of the database, default is ‘.xlsx’.
- index : bool, optional
Write row names (index), default is True.
- index_label : str or sequence, optional
Column label for index column(s) if desired. If not specified (default is None), and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
Warning
Slow method, not recommanded for large database.
-
save_as_sql
(self, new_data, table='main_table', name=None, ext='', index=True, index_label=None, driver=None, username=None, password=None, host=None, port=None, **kwargs)¶ Append and save new_data in SQL database.
SQL database as method={‘PostgreSQL’, ‘Oracle’, ‘MSSQL’, ‘MySQL’}. If name already exists append new_data, else create a new database. See SQLAlchemy documentation for more details [1].
Parameters: - new_data : pd.DataFrame
Data to append to the database.
- table : str, optional
Name of the table, default is ‘main_table’.
- name : str, optional
Name of the database, default is the current year.
- ext : str, optional
Extension of the database, default is ‘.db’.
- index : bool, optional
Write pd.DataFrame index as a column. Uses index_label as the column name in the table. Default is True.
- index_label : string or sequence, optional
Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the pd.DataFrame uses pd.MultiIndex.
- driver : {‘psycopg2’, ‘pg8000’, ‘mysqlclient’, pymysql’, ‘cx_oracle’, ‘pyodbc’, ‘pymssql’}
The name of the DBAPI to be used to connect to the database using all lowercase letters. If not specified, a default DBAPI will be imported if available - this default is typically the most widely known driver available for that backend.
- username, password : str
Username and password to connect to the SQL database.
- host : str, optional
Host to connect, default is ‘localhost’.
- port : str, optional
The port number, default is None.
- kwargs : dict, optional
A dictionary of options to be passed to the dialect and/or the DBAPI upon connect.
References
[1] https://docs.sqlalchemy.org/en/13/core/engines.html
-
save_as_sqlite
(self, new_data, table='main_table', name=None, ext='.db', index=True, index_label=None)¶ Append and save new_data in SQLite database.
With sqlite, if name database already exists append new_data, else create a new data base.
Parameters: - new_data : pd.DataFrame
Data to append to the database.
- table : str, optional
Name of the table, default is ‘main_table’.
- name : str, optional
Name of the database, default is the current year.
- ext : str, optional
Extension of the database, default is ‘.db’.
- index : bool, optional
Write pd.DataFrame index as a column. Uses index_label as the column name in the table. Default is True.
- index_label : string or sequence, optional
Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the pd.DataFrame uses pd.MultiIndex.
-
dccd.tools.io.
get_df
(path, name, ext='')¶ Load a dataframe as binnary file.
Parameters: - path, name, ext : str
Path to the file, name of the file and the extension of the file.
Returns: - pandas.DataFrame
A dataframe, if file not find return an empty dataframe.
-
dccd.tools.io.
save_df
(df, path, name, ext='')¶ Save a dataframe as a binnary file.
Parameters: - df : pandas.DataFrame
A dataframe to save as binnary file.
- path, name, ext : str
Path to the file, name of the file and the extension of the file.