lwetl - Light Weight Extraction Transform Load tool¶
Introduction¶
The module lwetl is a light-weight database client to transfer data between various databases, or between tables of the same database.
It is intended as an administrative or development tool to script quick modifications to an existing database. It uses python 3 in combination with the JayDeBeApi module and JDBC jar files.
Typical usage¶
- extract data from a database, either through a command-line interface, or through python classes.
- upload table rows into a target database.
- transfer of (modified) data to files or pipes in common formats such as: text, csv, xml, xlsx, or sql.
- extract or upload binary data (not supported by all JDBC drivers).
Key features¶
- A centralized configuration file for database connections:
- choice of the JDBC driver.
- definitions of the JDBC connection URLs.
- optional parsing of ORACLE’s tnsnames.ora for access through the JDBC thin client.
- optional saving of database access credentials as an alias. The password in these credentials may be encrypted with a master password.
- Direct command-line access to a database for upload and download.
- Command-line transfer of tables between independent database instances, possibly of a different server-type.
- Python classes for encapsulated transfer of data.
Due to its nature, the tool is suited for small- or medium-sized transformation, import, or extraction tasks (a throughput rate of up to 4000 records per second).
Multi-threading of the database connection is not supported.
Requirements¶
- A python 3 environment with permission to install modules (system-wide or as virtual environment).
- A Java 1.7+ runtime environment
- Write-access to the user home-directory (a configuration directory $HOME/.lwetl is auto-created).
Status¶
This project is in a pre-pre-alpha stage. It has been tested with drivers for mysql, sqlserver, oracle, postgresql, and sqlite:
- Linux Mint (Ubuntu, Debian) with python 3.5.
- CentOS Linux (Redhat, Fedora) with python 3.4.
- Windows 10 Home with Anaconda python 3.6.
Warning
Some database servers (e.g., MySQL) may make a distinction between upper-case and lower-case table names and/or column names. This might cause errors, since all current tests have been performed in environments where such a distinction does not exist.
Installation¶
The module depends on JPype1 and optionally regex. Both need access to a compiler if installed with pip.
Note: the regex module is used to parse the ORACLE connection configuration (tnsnames.ora). If you do not intend to access ORACLE through the settings of this file, the module may be ignored.
Operating Systems¶
Linux¶
The module may be installed in a python virtual environment, for example:
virtualenv --no-site-packages -p /usr/bin/python3 $HOME/my_virtual_envs/jdbc
source $HOME/my_virtual_envs/jdbc/bin/activate
The module can be installed with pip from github:
pip install git+https://github.com/rene-bakker-it/lwetl.git
Alternatively the repository may first be cloned:
git clone https://github.com/rene-bakker-it/lwetl.git
cd lwetl
pip install .
Windows¶
Note 1: The module depends on Java. Make sure that the JVM and Python are both of the same type, either 32 bit or 64 bit. Only the 64 bit version has been tested.
Note 2: the lwetl package depends on the module cryptography, which depends on OpenSSL.
pip install git+https://github.com/rene-bakker-it/lwetl.git
pip install regex
Dependencies¶
The module depends on the following packages:
- et-xmlfile,
- JayDeBeApi,
- jdcal,
- JPype1,
- openpyxl,
- psutil,
- PyYAML,
- cryptography, and
- regex (optionally).
Tests in the tests directory are based on pytest, which also requires: pytest-html, pytest-metadata, and pytest-progress.
Documentation in the docs directory is based on Sphinx and the read-the-docs theme.
Developers who want to use the utility script set-version.py in the main directory of the source code should also install GitPython.
Examples and use¶
General¶
Make sure the java JRE (or JDK) is known to the system. If this is not the case, add JAVA_HOME to the system environment, or specify it in the env section of the configuration file (see below).
If successful, the command sql-query list should run without errors. You may see messages like:
INFO: ojdbc6-12.1.0.1-atlassian-hosted.jar downloaded to: ./jdbc/lib/python3.4/site-packages/lib
INFO: postgresql-9.4.1208-jdbc42-atlassian-hosted.jar downloaded to: ./jdbc/lib/python3.4/site-packages/lib
INFO: sqlite-jdbc-3.21.0.jar downloaded to: ./jdbc/lib/python3.4/site-packages/lib
INFO: mysql-connector-java-5.1.39.jar downloaded to: ./jdbc/lib/python3.4/site-packages/lib
INFO: mssql-jdbc-6.3.5.jre8-preview.jar downloaded to: ./jdbc/lib/python3.4/site-packages/lib
These are downloads, which typically take place only once. The origin of these files may be found in:
$HOME/.lwetl/config-example.yml
To connect to a database, server definitions must be added to the YAML file $HOME/.lwetl/config.yml; see the file $HOME/.lwetl/config-example.yml for some examples.
Invocation from the command-line¶
A correctly configured connection may then be used like:
sql-query <username/password@server> "SQL statement"
or with a configured alias:
sql-query <alias> "SQL statement"
Implemented command-line programs are (use the -h option for help):
- lwetl-security: to encrypt/decrypt the passwords in the alias with a master password.
- sql-query: a general-purpose command-line SQL parser, uploader, and downloader.
- table-cardinality: to dump cardinality data of a table into an xlsx spreadsheet.
- db-copy: to copy entire tables between database instances.
Alternatively they may be invoked as a module, for example:
python -m lwetl.programs.sql_query.main
Invocation inside python¶
The repository directory examples is intended for a collection of simple example scripts. The module directory lwetl/programs provides more advanced examples.
An example to dump binary images from the database into the current directory:
from lwetl import Jdbc

jdbc = Jdbc('login alias')
for fname, img in jdbc.query("SELECT file_name, image_field FROM MY_TABLE"):
    with open(fname, 'wb') as f:
        f.write(img)
Module components¶
The configuration file (config.yml)¶
Locations¶
The configuration-file (config.yml) may be stored at the following locations:
- in the root of the module (always present),
- for Linux systems in the system directory /etc/lwetl/config.yml,
- in the user-home directory $HOME/.lwetl/config.yml (auto-created),
- in the current directory.
Upon invocation the program scans the locations in the order given above, and identical definitions are successively overwritten.
Format¶
The configuration must be in YAML format and may contain any of the following sections:
- env: a section to specify or change system environment variables.
- drivers: to specify the jdbc drivers to use and their configuration.
- servers: to specify database servers and the database schemes (instances).
- alias: containing access credentials and references to database servers, which were specified in the servers section.
- encrypt: (true|false) to specify if the passwords in the alias section should be encrypted with a master password.
Note: access credentials in the alias section may be stored in plain text. If security is an issue, you have the following options:
- make sure that the configuration file is properly read-protected.
- use the lwetl-security program to encrypt the passwords with a master password. By default the master password is asked each time you open a database connection. As an alternative it may be stored in the environment variable LWETL (less secure).
- if you do not create aliases, the username and password must be entered in the appropriate methods.
Encryption with a master password¶
# (re) encrypt the home configuration file with a password
# 1. make sure the master password has to be entered on the command line:
unset LWETL
# 2. (re) encrypt
# the current and new password will be asked on the command line
lwetl-security -c ~/.lwetl/config.yml
# remove encryption
lwetl-security -r ~/.lwetl/config.yml
Example¶
# user defined environment variables
env:
    ORACLE_HOME: /usr/lib/oracle/12.1/client64
    TNS: /usr/lib/oracle/12.1/client64/network/admin/tnsnames.ora

# jdbc drivers identified by a type (used as type in the next section)
#
# required parameters:
# - jar:    url to download the jdbc jar file if not found on the module library
#           may either be an url or a fixed reference to a file on the file-system
# - class:  name of the class to use in the jar file
# - url:    start of the connection url, will be extended with the url defined in the section 'servers'
#
# optional parameters:
# - attr:   additional attributes to add at the end of the generated connection url
# - escape: boolean - if set to true, all column names will be escaped in the uploader routines.
#           Permits the use of reserved words as column names.
#           WARNING: not implemented for postgresql
#
# WARNING: the strings used to define the driver types below are also used in the python code and should not be changed.
drivers:
    sqlserver:
        # binary upload for jtds driver not supported
        jar: 'http://central.maven.org/maven2/com/microsoft/sqlserver/mssql-jdbc/6.3.5.jre8-preview/mssql-jdbc-6.3.5.jre8-preview.jar'
        class: 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
        url: 'jdbc:sqlserver://'
    mysql:
        jar: 'http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.39/mysql-connector-java-5.1.39.jar'
        class: 'com.mysql.jdbc.Driver'
        url: 'jdbc:mysql://'
        attr: '?autoReconnect=true&useSSL=false'
        escape: true
    oracle:
        jar: 'https://maven.atlassian.com/3rdparty/com/oracle/ojdbc6/12.1.0.1-atlassian-hosted/ojdbc6-12.1.0.1-atlassian-hosted.jar'
        class: 'oracle.jdbc.OracleDriver'
        url: 'jdbc:oracle:thin:@'
    postgresql:
        # jar: 'http://central.maven.org/maven2/org/postgresql/postgresql/42.1.4.jre7/postgresql-42.1.4.jre7.jar'
        jar: 'https://maven.atlassian.com/3rdparty/postgresql/postgresql/9.4.1208-jdbc42-atlassian-hosted/postgresql-9.4.1208-jdbc42-atlassian-hosted.jar'
        class: 'org.postgresql.Driver'
        url: 'jdbc:postgresql://'

# servers
# defines database servers on the schema (instance) level
#
# required parameters:
# - type - must be one of the types defined in drivers
# - url  - connection url. The complete url is <url_driver><url server><attr driver>
#
# NOTE: for ORACLE additional server names may be obtained from the file tnsnames.ora
servers:
    scott_mysql:
        type: mysql
        url: "192.168.7.33:3306/scott"
    scott_postgresql:
        type: postgresql
        url: "172.56.11.41:5432/scott"
    scott_sqlserver:
        type: sqlserver
        url: '172.56.11.41\scott:1534'

# alias for connections, in ORACLE credentials format
# <username>/<password>@<servername>
encrypted: false
alias:
    scott_mysql: "scott/tiger@scott_mysql"
    scott_postgresql: "scott/tiger@scott_postgresql"
    scott_sqlserver: "scott/tiger@scott_sqlserver"
    scott_oracle: "scott/tiger@scott_oracle"
    scot: "scot/xxxxxxxx@tns_entry"
Sections¶
env - environment¶
Function:
- specify the jre/jdk for the jdbc drivers
- specify the location of ORACLE configurations
By default this section is empty.
Example
env:
    # Windows
    JAVA_HOME: 'C:\Progra~1\Java\jre1.8.0_65'
    ORACLE_HOME: 'C:\Oracle\product\11.2.0'
    # Linux
    TNS: /usr/lib/oracle/12.1/client64/network/admin/tnsnames.ora
- Note 1: if only ORACLE_HOME is specified, the program will search for the file $ORACLE_HOME/network/admin/tnsnames.ora. If TNS is also specified, the program will first look at the location specified by $TNS. Only if this file is not found, it will look at the previous location.
- Note 2: on Windows 64-bit systems:
Progra~1 = 'Program Files'
Progra~2 = 'Program Files (x86)'
drivers - Jdbc driver definitions¶
Function - associate a unique tag to a database server type:
- specify a source location of a jdbc jar file (url or file)
- specify the jdbc driver class of the jar file
- specify the base of the connection url
servers - Database server definitions¶
Function - associate a unique tag to a database connection:
- the driver to use (see previous section)
- the main connection URL, specifying the IP address of the database server and the scheme/instance of the database.
The complete connection URL is the concatenation <url of the driver><url of the server><attr of the driver>; for the scott_mysql example above this resolves to jdbc:mysql://192.168.7.33:3306/scott?autoReconnect=true&useSSL=false.
The main connection¶
The class Jdbc creates a connection to a database, which remains open until the object is destroyed.

class Jdbc(login, auto_commit=False, upper_case=True)¶
Creates a connection. Raises an exception if the connection fails, see the example below.
Parameters:
- login (str) – login alias defined in config.yml, or authentication credentials in the format <username>/<password>@<server>. The parser assumes that the name of the service does not contain the ‘@’ character. The password should not contain a ‘/’.
- auto_commit (bool) – specifies the auto-commit mode of the connection at startup. Defaults to False (auto-commit disabled).
- upper_case (bool) – specifies if the column names of SQL queries are converted into upper-case. Convenient if the result of queries is converted into dictionaries.
Example:
from lwetl import Jdbc, ServiceNotFoundException, DriverNotFoundException

# create a connection as user 'scott' with password 'tiger' to the oracle server 'osrv01'
# (as defined in tnsnames.ora)
try:
    jdbc = Jdbc('scott/tiger@osrv01')
except (ServiceNotFoundException, DriverNotFoundException) as nfe:
    print('ERROR - could not initialize: ' + str(nfe))
except ConnectionError as ce:
    print('ERROR - failed to connect: ' + str(ce))
connection¶
The connection to the database as returned by jaydebeapi.connect. See PEP249 for further details.
execute(sql: str, parameters: (list, tuple) = None, cursor: Union[Cursor, None] = None, use_current_cursor: bool = True, keep_cursor: bool = False) → Cursor¶
Execute a query, optionally with a list of parameters, or a list of lists of parameters.
Raises an SQLExcecuteException if the query fails, see the example below.
Parameters:
- sql (str) – query to execute
- parameters (tuple,list,None) – parameters specified in the sql query. May also be None (no parameters), or a list of lists.
- cursor (Cursor,None) – the cursor to use for execution. Auto-generated, if not specified. For details see the next function arguments.
- use_current_cursor (bool) – if the cursor is not specified, it will try to use the last-used cursor of the same life-cycle type. See the next argument for details. Defaults to True.
- keep_cursor (bool) – life-span of the cursor: keep it on a commit or rollback. Defaults to False. It is intended to permit long-lived cursors, when a large number of rows is read from the database while the reads trigger write-operations on a different cursor, possibly with multiple commits.
Returns: a jaydebeapi.connect.Cursor for further processing.
Example:
from lwetl import Jdbc, SQLExcecuteException

jdbc = Jdbc('scott/tiger@osrv01')
try:
    cur = jdbc.execute("INSERT INTO TST_NAMES (ID, USERNAME) VALUES (17,'scott')")
except SQLExcecuteException as sqle:
    print('ERROR - could not execute: ' + str(sqle))
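The parameters argument may be used for parameterized statements. A minimal sketch, assuming the underlying JDBC driver accepts '?' placeholders (the table and column names are hypothetical):
from lwetl import Jdbc

jdbc = Jdbc('scott/tiger@osrv01')
# a single statement with bind parameters
jdbc.execute("INSERT INTO TST_NAMES (ID, USERNAME) VALUES (?, ?)", [18, 'tiger'])
# a list of lists executes the same statement once per parameter set
jdbc.execute("INSERT INTO TST_NAMES (ID, USERNAME) VALUES (?, ?)",
             [[19, 'adams'], [20, 'jones']])
jdbc.commit()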
close(cursor=None)¶
Closes the specified cursor. Uses the current cursor if not specified. Cursors which are already closed, or are not associated to the jdbc connection, are silently ignored.
get_columns(cursor=None) → OrderedDict¶
Parameters: cursor (Cursor) – the cursor to query. Uses the last used (current) cursor, if not specified.
Returns: the columns associated to the cursor as an OrderedDict, or an empty dictionary if no columns were found.
commit()¶
Commits pending modifications of the specified cursor to the database. Commits and invalidates all cursors with pending commits.
Use appropriate arguments in the execute() command, if you intend to generate read-only queries, which must extend over multiple commits to the database. It is recommended to use the query() command for this purpose.
rollback()¶
Rolls back pending modifications to the database. Cancels and invalidates all cursors with pending commits.
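A minimal sketch of the commit/rollback cycle on a connection opened with auto_commit=False (the table name is hypothetical):
from lwetl import Jdbc, SQLExcecuteException

jdbc = Jdbc('scott')
try:
    jdbc.execute("UPDATE TST_NAMES SET USERNAME = 'scotty' WHERE ID = 17")
    jdbc.commit()
except SQLExcecuteException as sqle:
    # undo the pending modifications of this connection
    jdbc.rollback()
    print('ERROR: ' + str(sqle))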
get_data(cursor: Cursor = None, return_type=tuple, include_none=False, max_rows: int = 0, array_size: int = 1000) → iterator¶
Get the data retrieved by an execute() command.
Parameters:
- cursor (Cursor) – cursor to query, use current if not specified.
- return_type (Any) – the return type of the method. Defaults to tuple. Other valid options are list, dict, OrderedDict, or a (tuple of) strings. In case of the latter, the output is cast to the specified types. Supported types are Any (no casting), str, int, bool, float, date, or a format string compatible with datetime.strptime(). The format string for ‘date’ is ‘%Y-%m-%d [%H:%M:%S]’. If a single string is specified, the returned row will only be the first value of each row. Otherwise the output is a tuple of values with a maximum length of the specified input tuple. This option is particularly useful for connections to sqlite, where the auto-casting of types by the jdbc driver may fail.
- include_none (bool) – if set to True, also returns None values in dictionaries. Defaults to False. For tuple, or list, all elements are always returned.
- max_rows (int) – maximum number of rows to return before closing the cursor. Negative or zero implies all rows.
- array_size (int) – the buffer size to retrieve batches of data.
Returns: an iterator with rows of data obtained from the SQL, with the data-type specified by the return_type parameter.
query(sql: str, parameters=None, return_type=tuple, max_rows=0, array_size=1000) → iterator¶
Combines execute() and get_data() into a single statement.
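A short sketch of query() with a dictionary return type (the table and column names are hypothetical):
from lwetl import Jdbc

jdbc = Jdbc('scott')
# each row is returned as a dictionary keyed on the (upper-case) column names
for row in jdbc.query("SELECT ID, USERNAME FROM TST_NAMES", return_type=dict, max_rows=10):
    print(row['ID'], row['USERNAME'])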
query_single(sql: str, parameters=None, return_type=tuple) → (tuple, list, dict, OrderedDict)¶
Returns: only the first row from query().

query_single_value(sql: str, parameters=None)¶
Returns: the first column from query_single().

get_int(sql: str, parameters=None)¶
A short-cut for: int(query_single_value(sql, parameters))
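For example (a minimal sketch; the table and column names are hypothetical):
from lwetl import Jdbc

jdbc = Jdbc('scott')
nr_of_names = jdbc.get_int("SELECT COUNT(*) FROM TST_NAMES")
last_name = jdbc.query_single_value("SELECT USERNAME FROM TST_NAMES ORDER BY ID DESC")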
Exceptions¶
class SQLExcecuteException¶
Raised when an execute() command cannot be parsed.

class ServiceNotFoundException¶
Raised when a database connection cannot reach the database server.

class DriverNotFoundException¶
Raised when the jdbc driver, associated to the database connection, cannot be retrieved.

class CommitException¶
Raised when a commit() command fails.
Utility functions and classes¶
class JdbcInfo(login: str)¶
Displays parameter information of the jdbc driver.
Parameters: login (str) – login alias defined in config.yml, or authentication credentials.
Example:
from lwetl import JdbcInfo

jdbc_info = JdbcInfo('scott')
jdbc_info()
get_execution_statistics() → str¶
Retrieves some timing statistics on the established connections.
Return type: multi-line string
tag_connection(tag: str, jdbc: Jdbc)¶
Marks specific connections, such that the function get_execution_statistics() provides more detail.
Parameters:
- tag (str) – a tag for a connection
- jdbc (Jdbc) – an established database connection
Example:
from lwetl import Jdbc, get_execution_statistics, tag_connection

jdbc = {
    'SRC': Jdbc('scott_source'),
    'TRG': Jdbc('scott_target')
}
for tag, con in jdbc.items():
    tag_connection(tag, con)

# do lots of work

print(get_execution_statistics())
Import into a database¶
Import comes either through an SQL import reader, or through objects that read tables of data.
SQL Import¶
InputParser(sql_or_filename_or_stream=None, sql_terminator: str = ';')¶
Class to parse SQL input, either from a file or from a stream (e.g., stdin). Assumes that all SQL statements are terminated with an sql_terminator character (defaults to a semi-colon) at the end of a line.
Warning
This class may fail on multi-line string inputs that use the same character at the end of a line.
Example:
import sys

from lwetl import Jdbc, InputParser

jdbc = Jdbc('scott/tiger@osrv01')
with InputParser(sys.stdin) as parser:
    for sql in parser.parse():
        jdbc.execute(sql)
set_sql_terminator(sql_terminator)¶
Specifies the SQL terminator.
Parameters: sql_terminator (str) – the specified SQL terminator

open(sql_fn_stream=None)¶
Opens a TextIOWrapper for input.
Parameters: sql_fn_stream (TextIOWrapper) – the stream to open

close()¶
Closes the input stream.

parse(array_size=1) → iterator¶
Parses the input stream.
Parameters: array_size (int) – buffer size of the iterator
Returns: an iterator of SQL commands
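A second sketch, reading statements from a file with a non-default terminator (the file name is hypothetical):
from lwetl import Jdbc, InputParser

jdbc = Jdbc('scott')
# parse a script whose statements end with '/' instead of ';'
with InputParser('my_script.sql', sql_terminator='/') as parser:
    for sql in parser.parse():
        jdbc.execute(sql)
jdbc.commit()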
Data Readers¶
class CsvImport(filename_or_stream=None, delimiter='\t', encoding='utf-8')¶
Open a CSV file and extract the data by row in the form of a dictionary. Expects the first row of the input to contain the column names.
Parameters:
- filename_or_stream – if the argument is a string, the program will try to open the file with this name. For streams, it will use the stream as-is. Defaults to the stdin.
- delimiter (str) – specifies the column delimiter of the CSV file. Defaults to the TAB character.
- encoding (str) – character encoding to use for the input file. Defaults to utf-8.
open(filename_or_stream=None, delimiter=None, encoding=None)¶
Parameters:
- filename_or_stream – specifies the input. If not specified, it takes the specifier given when the class object was created.
- delimiter (str) – specifies the column delimiter. If not specified, it takes the delimiter specified when the object was created.
- encoding (str) – specifies the character encoding. If not specified, it takes the encoding specified when the object was created.
close()¶
Closes the input stream. Only has an effect if the input was specified as a filename.

get_data(max_rows=1000) → iterator¶
Retrieves the data as a generator/iterator.
Parameters: max_rows (int) – the buffer size of the iterator.
class LdifImport(filename_or_stream=None, separator=None, encoding='utf-8')¶
Open an LDIF file and extract data as a dictionary with the attribute names as keys.
Parameters:
- filename_or_stream – if the argument is a string, the program will try to open the file with this name. For streams, it will use the stream as-is. Defaults to the stdin.
- separator (str) – in an ldif file, an attribute may occur multiple times in the same record. In such cases the value of the dictionary becomes a list. If the separator is specified, this list is transformed into a string, separating the elements with the specified separator.
- encoding (str) – character encoding to use for the input file. Defaults to utf-8.
open(filename_or_stream=None, separator=None, encoding=None)¶
Opens the file or stream for input.
Parameters:
- filename_or_stream – specifies the input. If not specified, it takes the specifier given when the class object was created.
- separator (str) – in an ldif file, an attribute may occur multiple times in the same record. In such cases the value of the dictionary becomes a list. If the separator is specified, this list is transformed into a string, separating the elements with the specified separator.
- encoding (str) – specifies the character encoding. If not specified, it takes the encoding specified when the object was created.

close()¶
Closes the input stream. Only has an effect if the input was specified as a filename.

get_data() → iterator¶
Retrieves the data as a generator/iterator.
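A minimal sketch, reading an LDIF export from a file (the file name and attribute names are hypothetical):
from lwetl import LdifImport

ldif = LdifImport('users.ldif', separator='; ')
ldif.open()
for record in ldif.get_data():
    # each record is a dictionary keyed on the LDIF attribute names
    print(record.get('cn'), record.get('mail'))
ldif.close()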
class XlsxImport(file_name: str, sheet_name: str = None)¶
Open an xlsx worksheet and extract the data by row in the form of a dictionary. Expects the first row of the worksheet to contain the column names.

open(file_name: str = None, sheet_name: str = None)¶
Opens the specified workbook and worksheet.

close()¶
Closes the workbook.

get_data(max_rows=1000) → iterator¶
Retrieves the data as a generator/iterator; max_rows specifies the buffer size.
Examples¶
Import from the stdin in CSV format and upload in native query format (see next section).
import sys

import lwetl
from lwetl import Jdbc, CsvImport, NativeUploader

jdbc = Jdbc('scott')
with NativeUploader(jdbc, 'TARGET_TABLE', commit_mode=lwetl.UPLOAD_MODE_COMMIT) as upl:
    # read CSV from stdin
    with CsvImport(sys.stdin) as csv:
        for r in csv.get_data():
            upl.insert(r)
Import from an excel 2007+ spreadsheet and upload using parameterized SQL syntax (see next section).
import lwetl
from lwetl import Jdbc, XlsxImport, ParameterUploader

jdbc = Jdbc('scott')
table = 'TARGET_TABLE'

# alternative to the with statement
xls = XlsxImport()
xls.open(table + '.xlsx')
with ParameterUploader(jdbc, table, commit_mode=lwetl.UPLOAD_MODE_COMMIT) as upl:
    for r in xls.get_data():
        upl.insert(r)
        if upl.rowcount > 1000:
            upl.commit()
    if upl.rowcount > 0:
        upl.commit()
xls.close()
Upload models¶
Operational modes¶
Import into a database has the following modes of operation:
- UPLOAD_MODE_DRYRUN: SQL statements are generated, but not sent to the database.
- UPLOAD_MODE_PIPE: SQL statements are generated and piped for further processing. The database itself is not touched.
- UPLOAD_MODE_ROLLBACK: SQL statements are generated and executed on the database. However, the commit statement performs a rollback instead.
Warning
This mode is not compatible with a database connection in auto-commit mode. It will also fail if the user sends commit commands independently.
- UPLOAD_MODE_COMMIT: SQL statements are generated and executed on the database. The commit statement sends a commit to the database.
Classes¶
class NativeUploader(jdbc: Jdbc, table: str, fstream=None, commit_mode=UPLOAD_MODE_DRYRUN, exit_on_fail=True)¶
Upload data into a table with native SQL (no parameters in the jdbc execute command).
Parameters:
- jdbc (Jdbc) – the target database connection
- table (str) – name of the table in the database to insert the data
- fstream –
- commit_mode (str) – the upload mode, see Operational modes.
- exit_on_fail (bool) – clear the commit buffer and exit if an insert, update, or delete command fails.

insert(data: dict)¶
Insert into the table.
Parameters: data (dict) – a dictionary of keys (column names) and values. Keys that do not correspond to existing column names are ignored.
update(data: dict, where_clause)¶
Update existing rows in the table.
Parameters:
- data (dict) – a dictionary of keys (column names) and values. Keys that do not correspond to existing column names are ignored.
- where_clause (None,str,dict) – filter for the rows to update. Valid formats for the where clause are:
  - None – updates all rows.
  - str – raw SQL WHERE clause (the keyword WHERE may be omitted).
  - dict – keys are column names. Non-existing column names are ignored. Multiple columns are combined with the AND statement. The value may be:
    - a value (results in COLUMN_NAME = VALUE)
    - a string with an operator and value, e.g., LIKE 'ABC%'
    - a tuple (operator, value), e.g., ('>=', 7)
delete(where_clause)¶
Delete rows from the table.
Parameters: where_clause (None,str,dict) – filter for the rows to delete. Formats are identical to the update statement.
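A short sketch of update() and delete() with the various where_clause formats (the table and column names are hypothetical):
from lwetl import Jdbc, NativeUploader, UPLOAD_MODE_COMMIT

jdbc = Jdbc('scott')
with NativeUploader(jdbc, 'TST_NAMES', commit_mode=UPLOAD_MODE_COMMIT) as upl:
    # dict where clause: column = value
    upl.update({'USERNAME': 'scotty'}, {'ID': 17})
    # dict where clause with an (operator, value) tuple
    upl.update({'USERNAME': 'unknown'}, {'ID': ('>=', 1000)})
    # raw SQL where clause (the keyword WHERE may be omitted)
    upl.delete("USERNAME LIKE 'test%'")
    upl.commit()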
commit()¶
Processes previous insert/update/delete statements depending on the Operational modes of the instance:
- UPLOAD_MODE_COMMIT: sends a commit statement to the database.
- UPLOAD_MODE_ROLLBACK: sends a rollback statement to the database.
- UPLOAD_MODE_DRYRUN: does nothing.
- UPLOAD_MODE_PIPE: work in progress.
Warning
The rollback mode is not compatible with a database connection in auto-commit mode. It will also fail if the user sends commit commands independently.
add_counter(columns: (str, list, set, tuple))¶
Mark columns as counters. Assumes the column type is a number. Queries the maximum value of each column and then adds the next value (+1) to the column on each insert.
Parameters: columns (str,list,set,tuple) – names of the columns to add. May be a (comma-separated) string, or a list type.
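A minimal sketch, letting the uploader generate the primary key (the table and column names are hypothetical):
from lwetl import Jdbc, NativeUploader, UPLOAD_MODE_COMMIT

jdbc = Jdbc('scott')
with NativeUploader(jdbc, 'TST_NAMES', commit_mode=UPLOAD_MODE_COMMIT) as upl:
    upl.add_counter('ID')
    # the ID column is filled automatically with max(ID) + 1
    upl.insert({'USERNAME': 'miller'})
    upl.commit()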
class ParameterUploader(jdbc: Jdbc, table: str, fstream=None, commit_mode=UPLOAD_MODE_DRYRUN, exit_on_fail=True)¶
Upload data into a table using parameterized SQL commands. See the section NativeUploader for details on the arguments.
insert(data: dict)¶
Insert into the table, see the NativeUploader for details.

update(data: dict, where_clause)¶
Update existing rows in the table, see the NativeUploader for details.

delete(where_clause)¶
Delete existing rows from the table, see the NativeUploader for details.

commit()¶
Processes previous insert/update/delete statements depending on the Operational modes of the instance. See the NativeUploader for details.

add_counter(columns: (str, list, set, tuple))¶
Mark columns as counters. Assumes the column type is a number. Queries the maximum value of each column and then adds the next value (+1) to the column on each insert. See the NativeUploader for details.
class
MultiParameterUploader
(jdbc: Jdbc, table: str, fstream=None, commit_mode=UPLOAD_MODE_DRYRUN, exit_on_fail=True)¶ Upload data into a table using the jdbc executemany parameterized command.
-
insert(data: dict):
Insert into the table, see the NativeUploader for details.
-
commit
()¶ Processes previous insert/update/delete statements depending on the Operational modes of the instance. See the NativeUploader for details
-
add_counter(columns: (str, list, set, tuple)):
Mark columns as counters. Assumes the column type is a number. Queries the maximum number of each column and then adds the next value (+1) in the column on each insert. See the NativeUploader for details
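A minimal sketch of a bulk insert with the MultiParameterUploader, assuming the same with-statement usage as the other uploaders (the table and column names are hypothetical):
from lwetl import Jdbc, MultiParameterUploader, UPLOAD_MODE_COMMIT

jdbc = Jdbc('scott')
rows = [{'ID': 1, 'USERNAME': 'scott'}, {'ID': 2, 'USERNAME': 'adams'}]
with MultiParameterUploader(jdbc, 'TST_NAMES', commit_mode=UPLOAD_MODE_COMMIT) as upl:
    for r in rows:
        upl.insert(r)
    # the buffered rows are sent with executemany and committed
    upl.commit()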
Export from a database¶
Formatters are intended to output table data. Supported formatters are:
- TextFormatter: outputs tables in plain text format with fixed-width columns.
- CsvFormatter: outputs tables in CSV format.
- XmlFormatter: outputs tables in XML format.
- XlsxFormatter: outputs tables in EXCEL xlsx format.
All formatters may be used in the following ways:
Example 1: function call
from lwetl import Jdbc, TextFormatter
jdbc = Jdbc('scott')
sql = 'SELECT * FROM MY_TABLE ORDER BY ID'
fmt = TextFormatter()
fmt(jdbc=jdbc, sql=sql)
Example 2: with statement
from lwetl import Jdbc, TextFormatter

jdbc = Jdbc('scott')
cur = jdbc.execute('SELECT * FROM MY_TABLE ORDER BY ID', cursor=None)
with TextFormatter(cursor=cur) as fmt:
    fmt.header()
    for row in jdbc.get_data(cur):
        fmt.write(row)
    fmt.footer()
Example 3: open/close
from lwetl import Jdbc, TextFormatter
jdbc = Jdbc('scott')
cur = jdbc.execute('SELECT * FROM MY_TABLE ORDER BY ID',cursor = None)
fmt = TextFormatter()
fmt.open(cursor=cur)
fmt.header()
for row in jdbc.get_data(cur):
    fmt.write(row)
fmt.footer()
fmt.close()
Below only the TextFormatter is described in detail. For the other formatters only the differences are mentioned.
TextFormatter()¶
Outputs a table in plain text format with fixed-width columns.

__init__(*args, **kwargs)¶
Instantiate. All arguments are optional.
Parameters:
- cursor (Cursor) – cursor generated by jdbc.execute()
- filename_or_stream ((str,TextIOWrapper,StringIO)) – specifier of the output stream. May be a filename (string) or a stream object.
- append (bool) – append to the specified file, rather than creating a new one. Defaults to False.
- column_width (int) – the width of each text column (only used in this class).
Example:
from lwetl import TextFormatter

fmt = TextFormatter(cursor=cursor, filename_or_stream='myoutput.txt', append=True)
__call__(*args, **kwargs)¶
Write a table in a single statement, see Example 1 above.
Parameters:
- jdbc (Jdbc) – the Jdbc connection
- sql (str) – the SQL to parse
Also accepts all arguments of the __init__() statement with the exception of the cursor.
open(*args, **kwargs)¶
Opens the file or stream for writing. Takes the same arguments as the __init__() statement.

close()¶
Closes the output file or stream (if applicable).

format(row)¶
Format the row of data.
Parameters: row ((list,tuple)) – a row of data.
Returns: a string.

header()¶
Write the header (column names) to the specified file or stream.

write(row: list)¶
Writes the output of format(row) to the specified output stream.
CsvFormatter()¶
Outputs a table in CSV format. The functionality is identical to the TextFormatter.
All functions accept the additional parameter:
- separator (str) – specifies the CSV column separator. Defaults to ‘;’
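For example, a comma-separated export to a file (a minimal sketch; the table name is hypothetical):
from lwetl import Jdbc, CsvFormatter

jdbc = Jdbc('scott')
fmt = CsvFormatter()
fmt(jdbc=jdbc, sql='SELECT * FROM MY_TABLE ORDER BY ID',
    filename_or_stream='my_table.csv', separator=',')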
XmlFormatter()¶
Outputs a table in XML format. The functionality is identical to the TextFormatter.
All functions accept the additional parameters:
- dialect (str) – specifies the XML dialect: ‘excel’, ‘value’, or ‘plain’. Defaults to ‘excel’
- pretty_print (bool) – output the xml in formatted mode, instead of compact. Defaults to False.
- sheet_name (str) – specifies the name of the worksheet. Defaults to ‘Sheet1’
next_sheet(cursor, sheet_name=None)¶
Initiates a new sheet with a new cursor.
Parameters:
- cursor (Cursor) – cursor generated by jdbc.execute()
- sheet_name (str) – name of the work sheet. Uses a counter like ‘SheetN’ if not specified.
Example:
from lwetl import Jdbc, XmlFormatter

jdbc = Jdbc('scott')
fmt = None
for table in ['MY_TABLE1', 'MY_TABLE_2']:
    cur = jdbc.execute('SELECT * from {0} ORDER BY ID'.format(table), cursor=None)
    if fmt is None:
        fmt = XmlFormatter()
        fmt.open(cursor=cur)
    else:
        fmt.next_sheet(cur)
    fmt.header()
    for row in jdbc.get_data(cur):
        fmt.write(row)
fmt.close()
XlsxFormatter(jdbc: Jdbc)¶
Outputs a table in an Excel 2007+ file. The functionality is identical to the XmlFormatter, but dialect and pretty_print are not supported. Instead, the argument pretty=True in the open method will reformat the xlsx file to autoset the column width and print the header line in bold.
Warning
Stream output is not supported. Only valid file-names are accepted for the argument filename_or_stream.
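A minimal sketch, writing a query result to an xlsx file, assuming the open() arguments behave as for the XmlFormatter (the file, sheet, and table names are hypothetical):
from lwetl import Jdbc, XlsxFormatter

jdbc = Jdbc('scott')
cur = jdbc.execute('SELECT * FROM MY_TABLE ORDER BY ID')
fmt = XlsxFormatter(jdbc)
fmt.open(cursor=cur, filename_or_stream='my_table.xlsx', pretty=True, sheet_name='MY_TABLE')
fmt.header()
for row in jdbc.get_data(cur):
    fmt.write(row)
fmt.close()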