Command Line
We've moved! To improve customer experience, the Collibra Data Quality User Guide has moved to the Collibra Documentation Center as part of the Collibra Data Quality 2022.11 release. To ensure a seamless transition, dq-docs.collibra.com will remain accessible, but the DQ User Guide is now maintained exclusively in the Documentation Center.
Where scale meets data science. Scale linearly with your data by adding executors, memory, or both:
-f "file:///Users/home/salary_data.csv" \
-d "," \
-rd "2018-01-08" \
-ds "salary_data" \
-numexecutors 2 \
-executormemory 2g
If Owl runs on an edge node of a popular Hadoop distribution such as HDP, CDH, or EMR, it automatically registers its jobs with the YARN ResourceManager.
Owl can also run against a Spark master: pass the -master option with a spark:// URL.
Owl can run in standalone mode, but it will not distribute processing beyond the hardware it was started on.
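The choice between a local run and a Spark master can be sketched as a small shell helper. This is an illustration only: `scale_args` and its `sa_args` variable are hypothetical and not part of Owl, and the sketch assumes the `-master`, `-numexecutors`, and `-executormemory` flags behave as described on this page.

```shell
# Hypothetical helper, not part of Owl: prints the scaling flags for one run.
# $1 = master (empty keeps the default, local[*]), $2 = executors, $3 = memory.
scale_args() {
  sa_args="-numexecutors $2 -executormemory $3"
  if [ -n "$1" ]; then
    sa_args="-master $1 $sa_args"
  fi
  printf '%s\n' "$sa_args"
}

scale_args "" 2 2g                      # local run: no -master flag
scale_args "spark://myhost:7077" 2 2g   # example Spark master URL
```

The printed flags would be appended to the rest of the Owl command line; keeping them in one place makes it easier to switch between a local run and a cluster run.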
| Options | Description |
|---|---|
| deploymode | Spark deploy mode option |
| drivermemory | Driver memory, for example 3G for local space |
| executorcores | Spark executor cores |
| executormemory | Spark executor memory option, for example 3G |
| master | Overrides local[*], for example spark://myhost:7077, yarn-client, yarn-cluster |
| sparkprinc | Kerberos principal name, ex: [email protected] |
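Memory values such as 3G can be sanity-checked before a job is launched. The sketch below is a deliberately conservative check, not Owl's or Spark's own validation: `is_memory_size` is a hypothetical helper that accepts only a whole number followed by a single unit letter, which is stricter than what Spark itself may accept.

```shell
# Hypothetical pre-flight check, not part of Owl or Spark.
# Succeeds for a whole number followed by one unit letter: k, m, g, or t.
is_memory_size() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+[kKmMgGtT]$'
}

for value in 3G 2g 1.5g lots; do
  if is_memory_size "$value"; then
    echo "$value: ok"
  else
    echo "$value: rejected"
  fi
done
```

A check like this could run in a wrapper script before drivermemory or executormemory values are passed on.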
spark-submit \
--driver-class-path /opt/owl/drivers/postgres42/postgresql-42.2.4.jar \
--driver-library-path /opt/owl/drivers/postgres42/postgresql-42.2.4.jar \
--driver-memory 3g --num-executors 2 --executor-memory 1g \
--master spark://Kirks-MBP.home:7077 \
--class com.owl.core.cli.OwlCheck /opt/owl/bin/owl-core-trunk-jar-with-dependencies.jar \
-u user -p pass -c jdbc:postgresql://xyz.chzid9w0hpyi.us-east-1.rds.amazonaws.com/postgres \
-ds accounts -rd 2019-05-05 -dssafeoff -q "select * from accounts" \
-driver org.postgresql.Driver -lib /opt/owl/drivers/postgres42/
spark-submit \
--driver-class-path /opt/owl/drivers/postgres42/postgresql-42.2.4.jar \
--driver-library-path /opt/owl/drivers/postgres42/postgresql-42.2.4.jar \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:///opt/owl/config/log4j-TRACE.properties \
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///opt/owl/config/log4j-TRACE.properties \
--files /opt/owl/config/log4j-TRACE.properties \
--driver-memory 2g --num-executors 2 --executor-memory 1g --master spark://Kirks-MBP.home:7077 \
--class com.owl.core.cli.OwlCheck /opt/owl/bin/owl-core-trunk-jar-with-dependencies.jar \
-u us -p pass -c jdbc:postgresql://xyz.chzid9w0hpyi.us-east-1.rds.amazonaws.com/postgres \
-ds aumdt -rd 2019-05-05 -dssafeoff -q "select * from aum_dt" \
-driver org.postgresql.Driver -lib /opt/owl/drivers/postgres42/ \
-connectionprops fetchsize=6000 -master spark://Kirks-MBP.home:7077 \
-corroff -histoff -statsoff \
-columnname updt_ts -numpartitions 4 -lowerbound 1557597987353 -upperbound 1557597999947
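The -columnname, -numpartitions, -lowerbound, and -upperbound flags resemble the partitioning options of Spark's JDBC data source. Assuming Owl passes them through to Spark (this page does not confirm it), the sketch below shows roughly how a numeric range is split into per-partition predicates. `partition_predicates` is a hypothetical helper, and the even-stride arithmetic is a simplification; the exact calculation may differ between Spark versions.

```shell
# Hypothetical helper, not part of Owl or Spark.
# Prints one WHERE predicate per partition for an evenly split numeric range.
# $1 = column, $2 = lower bound, $3 = upper bound, $4 = number of partitions.
partition_predicates() {
  pp_stride=$(( ($3 - $2) / $4 ))   # integer division, remainder is dropped
  pp_bound=$2
  pp_i=0
  while [ "$pp_i" -lt "$4" ]; do
    pp_next=$(( $pp_bound + $pp_stride ))
    if [ "$4" -eq 1 ]; then
      echo "(no predicate: a single partition reads every row)"
    elif [ "$pp_i" -eq 0 ]; then
      # First partition is open below and also picks up NULLs.
      printf '%s < %s OR %s IS NULL\n' "$1" "$pp_next" "$1"
    elif [ "$pp_i" -eq $(( $4 - 1 )) ]; then
      # Last partition is open above.
      printf '%s >= %s\n' "$1" "$pp_bound"
    else
      printf '%s >= %s AND %s < %s\n' "$1" "$pp_bound" "$1" "$pp_next"
    fi
    pp_bound=$pp_next
    pp_i=$(( $pp_i + 1 ))
  done
}

# Values taken from the command above.
partition_predicates updt_ts 1557597987353 1557597999947 4
```

With the bounds from the command, the range of 12594 gives a stride of 3148 per partition. Under this model the bounds only decide where the slices are cut: rows outside the range are not filtered out, they fall into the first or last partition.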