Usage
DB-API
    from pyhive import presto  # or import hive
    cursor = presto.connect('localhost').cursor()
    cursor.execute('SELECT * FROM my_awesome_data LIMIT 10')
    print(cursor.fetchone())
    print(cursor.fetchall())

DB-API (asynchronous)
    from pyhive import hive
    from TCLIService.ttypes import TOperationState
    cursor = hive.connect('localhost').cursor()
    cursor.execute('SELECT * FROM my_awesome_data LIMIT 10', async=True)

    status = cursor.poll().operationState
    while status in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
        logs = cursor.fetch_logs()
        for message in logs:
            print(message)

        # If needed, an asynchronous query can be cancelled at any time with:
        # cursor.cancel()

        status = cursor.poll().operationState

    print(cursor.fetchall())

In Python 3.7, async became a reserved keyword; use async_ instead:
    cursor.execute('SELECT * FROM my_awesome_data LIMIT 10', async_=True)

SQLAlchemy
First install this package to register it with SQLAlchemy (see ).
    from sqlalchemy import *
    from sqlalchemy.engine import create_engine
    from sqlalchemy.schema import *
    # Presto
    engine = create_engine('presto://localhost:8080/hive/default')
    # Hive
    engine = create_engine('hive://localhost:10000/default')
    logs = Table('my_awesome_data', MetaData(bind=engine), autoload=True)
    print(select([func.count('*')], from_obj=logs).scalar())

Note: query generation functionality is not exhaustive or fully tested, but there should be no problem with raw SQL.
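Where generated queries fall short, raw SQL can be sent through a plain DB-API cursor instead. The following is a minimal sketch, not part of this package: iter_rows is a hypothetical helper that assumes only the standard DB-API 2.0 methods execute and fetchmany, and the stdlib sqlite3 module stands in for Hive/Presto so the snippet runs without a server.

```python
import sqlite3


def iter_rows(cursor, sql, batch_size=1000):
    """Run raw SQL on a DB-API cursor and yield rows one at a time.

    Rows are pulled from the driver in batches of `batch_size` so a large
    result set is never held in memory all at once.
    (Hypothetical helper; assumes the cursor implements DB-API 2.0.)
    """
    cursor.execute(sql)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row


# Stand-in database: sqlite3 is used here only because it ships with Python
# and follows the same DB-API interface.
conn = sqlite3.connect(':memory:')
setup = conn.cursor()
setup.execute('CREATE TABLE my_awesome_data (id INTEGER, msg TEXT)')
setup.executemany(
    'INSERT INTO my_awesome_data VALUES (?, ?)',
    [(i, 'row %d' % i) for i in range(5)],
)

rows = list(iter_rows(
    conn.cursor(),
    'SELECT * FROM my_awesome_data ORDER BY id LIMIT 10',
    batch_size=2,
))
print(len(rows))
```

With a real server, the cursor would presumably come from hive.connect('localhost').cursor() or presto.connect('localhost').cursor(), as in the DB-API example, rather than from sqlite3.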
Passing session configuration
    # DB-API
    hive.connect('localhost', configuration={'hive.exec.reducers.max': '123'})
    presto.connect('localhost', session_props={'query_max_run_time': '1234m'})
    # SQLAlchemy
    create_engine(
        'presto://user@host:443/hive',
        connect_args={'protocol': 'https',
                      'session_props': {'query_max_run_time': '1234m'}}
    )
    create_engine(
        'hive://user@host:10000/database',
        connect_args={'configuration': {'hive.exec.reducers.max': '123'}},
    )
    # SQLAlchemy with LDAP
    create_engine(
        'hive://user:password@host:10000/database',
        connect_args={'auth': 'LDAP'},
    )

Testing
Run the following in an environment with Hive/Presto:
    ./scripts/make_test_tables.sh
    virtualenv --no-site-packages env
    source env/bin/activate
    pip install -e .
    pip install -r dev_requirements.txt
    py.test

WARNING: This drops/creates tables named , , and , plus a database called .
Updating TCLIService
The TCLIService module is autogenerated using a file. To update it, the file can be used: . When left blank, the version for Hive 2.3 will be downloaded.