java,mysql,hive,sqoop,hortonworks-data-platform
Yes, you can do it via SSH. The Hortonworks Sandbox comes with SSH support pre-installed, so you can execute the Sqoop command through an SSH client on Windows. Or, if you want to do it programmatically (that is what I have done in Java), you have to follow these steps. Download the sshxcute Java...
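For the SSH route, a minimal sketch could look like the following (the sandbox host name, credentials, and Sqoop arguments are placeholders, not values from the question); on Windows the same command can be issued through an SSH client such as PuTTY's plink:
  ssh root@sandbox.hortonworks.com "sqoop import --connect jdbc:mysql://sandbox.hortonworks.com/mydb \
    --username myuser --password mypass --table mytable --target-dir /user/root/mytable"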
Use Ctrl+C to terminate the HBase shell; re-entering the hbase shell will then work.
hive,hbase,sqoop,apache-sqoop,apache-hive
HBase-Hive integration: creating an external table in Hive for an HBase table allows the HBase data to be queried in Hive without the need for duplicating it. You can update or delete data in the HBase table and see the modified data in Hive as well. Example: Consider...
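As a minimal sketch of such a mapping (the table name, column family, and columns are assumptions, not taken from the question), the Hive DDL generally looks like:
  hive -e "CREATE EXTERNAL TABLE hbase_orders (key STRING, amount DOUBLE)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:amount')
    TBLPROPERTIES ('hbase.table.name' = 'orders');"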
Fortunately, I found the answer to my question myself. I had to include the jar hadoop-0.20.2+737-core.jar in the build path instead of hadoop-0.20.2-core.jar. It looks like it is a modified version of the same file, containing the JobConf class with the getCredentials() method. The problem is solved, but I am still confused about those...
[solved] There was no need to change JAVA_HOME (or) HADOOP_MAPRED_HOME (or) SQOOP_HOME. As the error suggests, it was not able to find the Sqoop library, whereas I used the Sqoop 1.4.4 library in my program. Please note that I wasn't using any build tool (Maven, SBT or anything); I was using normal...
You can either merge the files, creating a new larger file, or you can set the number of mappers to 1 using -m 1 or --num-mappers 1.
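As a sketch of the second option (shown here for an import; the connection details and table name are placeholders, and the same flag applies to other Sqoop tools):
  sqoop import --connect jdbc:mysql://localhost/mydb --username myuser --password mypass \
    --table mytable -m 1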
There is a similar bug in Sqoop reported here. Please verify the MySQL connector version and the Sqoop version that you are using, and update them as required. Hope this solves your problem. Thanks.
Exporting data from HBase to an RDBMS is not supported so far. One workaround is to export the HBase data into a file in HDFS, which can then be exported into the RDBMS. For more information you can check Sqoop1 vs Sqoop2...
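A sketch of that workaround, assuming a Hive external table is already mapped onto the HBase table and Hive 0.11+ is available (all names and paths below are placeholders):
  hive -e "INSERT OVERWRITE DIRECTORY '/tmp/hbase_dump'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    SELECT * FROM hbase_orders;"
  sqoop export --connect jdbc:mysql://localhost/mydb --username myuser --password mypass \
    --table orders --export-dir /tmp/hbase_dump --input-fields-terminated-by ','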
Using Sqoop to transfer a few terabytes from an RDBMS to HDFS is a great idea, highly recommended. This is Sqoop's intended use case, and it does it reliably. Flume is mostly intended for streaming data, so if the files all contain events and you get new files frequently, then Flume with...
The basic problem is that Mongo stores its data in BSON format (binary JSON), while your HDFS data may be in different formats (text, sequence, Avro). The easiest thing to do would be to use Pig to load your results into MongoDB using this driver: https://github.com/mongodb/mongo-hadoop/tree/master/pig. You'll have to map...
In the --columns option, provide all the column names you want to import except the partition column. We don't specify the partition column in --columns because it gets added automatically. Below is an example: I am importing the DEMO2 table with columns ID, NAME, COUNTRY. I have not specified the COUNTRY column...
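A sketch of such a command, assuming COUNTRY is the Hive partition column (the connection string, partition value, and Hive table name are assumptions):
  sqoop import --connect jdbc:oracle:thin:@host:1521:ORCL --username myuser --password mypass \
    --table DEMO2 --columns ID,NAME \
    --hive-import --hive-table demo2 \
    --hive-partition-key COUNTRY --hive-partition-value 'INDIA'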
hadoop,hive,sqoop,hadoop-partitioning
Hive's default field delimiter is Ctrl-A; if you don't specify any delimiter, it will use the default. Add the line below to your Hive script: row format delimited fields terminated by '\t'...
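In context, that clause sits in the table DDL; a sketch with placeholder columns:
  hive -e "CREATE TABLE mytable (id INT, name STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"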
Sqoop is for getting data from relational databases only at the moment. Try using distcp for getting data from S3. The usage is documented here: http://wiki.apache.org/hadoop/AmazonS3, in the section "Running bulk copies in and out of S3"...
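The invocation generally looks like this (the bucket name, credentials, and destination path are placeholders; the s3n scheme is the one used on that wiki page):
  hadoop distcp s3n://ACCESS_KEY:SECRET_KEY@my-bucket/input /user/hadoop/input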
shell,hadoop,automation,hive,sqoop
Run the script: $ ./script.sh 20 (for the 20th entry)
$ cat script.sh
#!/bin/bash
PART_ID=$1
TARGET_DIR_ID=$PART_ID
echo "PART_ID:" $PART_ID "TARGET_DIR_ID: "$TARGET_DIR_ID
sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir...
The location into which you placed the file does not appear to be correct. For a table "test" you should put the file underneath a directory named test, but your command hadoop fs -put test /user/cloudera creates a file called test. You would likely find more success as follows: hadoop fs...
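A sketch of the layout described above (the exact directory path depends on your setup and is an assumption here):
  hadoop fs -mkdir /user/cloudera/test
  hadoop fs -put test /user/cloudera/test/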
Try adding these arguments to the export statement: --input-null-string "\\\\N" --input-null-non-string "\\\\N". From the documentation: if --input-null-string is not specified, then the string "null" will be interpreted as null for string-type columns; if --input-null-non-string is not specified, then both the string "null" and the empty string will be interpreted as...
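In context, an export with those arguments might look like this (the connection details and paths are placeholders):
  sqoop export --connect jdbc:mysql://localhost/mydb --username myuser --password mypass \
    --table mytable --export-dir /user/hive/warehouse/mytable \
    --input-null-string '\\N' --input-null-non-string '\\N'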
java,hadoop,sqoop,apache-sqoop
The range of salary is 5000 - 70000 (i.e. min 5000, max 70000). The salary range is divided into 4 splits: (70000 - 5000) / 4 = 16250. Hence, split 1: from 5000 to 21250 (= 5000 + 16250); split 2: from 21250 to 37500 (= 21250 + 16250); split 3: from 37500 to 53750 (= 37500 + 16250); split 4: from 53750...
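This is how Sqoop derives split boundaries for a numeric --split-by column: it runs SELECT MIN and MAX on the column and divides the range evenly across the mappers. A run that would produce these four ranges might look like this (the connection details and names are placeholders):
  sqoop import --connect jdbc:mysql://localhost/hr --username myuser --password mypass \
    --table employees --split-by salary -m 4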
YEAR and MONTH are not valid Teradata SQL, both are ODBC syntax, which is automatically rewritten by the ODBC driver. Try EXTRACT(YEAR FROM A.dt) instead....
sql,postgresql,shell,hadoop,sqoop
I solved the problem by changing my reduce function so that, if there were not the correct number of fields, it would output a certain value; I was then able to use --input-null-non-string with that value and it worked.
hdfs://localhost:9000/ is the Hadoop HDFS address. You can change the property in your app or upload your jar to HDFS. You are showing the ls output of your local Linux file system, but hdfs://localhost:9000/ is the address of the Hadoop HDFS file system....
Adding --fields-terminated-by '^' to the sqoop import solved a similar problem of mine.
Here is the complete procedure for installing Sqoop, plus the import and export commands. Hopefully it will be helpful to someone; this one is tried and tested by me and actually works. Download: apache.mirrors.tds.net/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz sudo mv sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz /usr/lib/sqoop Copy-paste the following two lines into .bashrc: export SQOOP_HOME=/usr/lib/sqoop export...
Looks like your JDBC URL is incorrect; it should be like jdbc:mysql://localhost/bbdatabank ...
It's difficult to say, but chances are conversation_id is not unique. For more interactive help on this subject, try the sqoop mailing lists.
Hue will submit this script through an Oozie Sqoop action, which has a particular way of specifying the arguments. Hue also comes with a built-in Sqoop example that you could try to modify for your import....
oracle,hadoop,hive,sqoop,hcatalog
Use the --map-column-java option to explicitly state that the column is of type String. Then --hive-drop-import-delims works as expected (removing \n from the data). Changed Sqoop command:
sqoop import --connect jdbc:oracle:thin:@ORA_IP:ORA_PORT:ORA_SID \
  --username user123 --password passwd123 -table SCHEMA.TBL_2 \
  --hcatalog-table tbl2 --hcatalog-database testdb --num-mappers 1 \
  --split-by SOME_ID --columns col1,col2,col3,col4 --hive-drop-import-delims...
java,hadoop,mapreduce,hdfs,sqoop
If there are no reducers, then the mapper output is written to HDFS. Even in this case the mapper output is not written directly to HDFS; it is first written to the individual node's disk and then copied over to HDFS. Sqoop is one scenario where the job is typically map-only, wherein you...
sqoop,datastax-enterprise,datastax
Sounds like you're trying to do something similar to slide 47 in this deck: http://www.slideshare.net/planetcassandra/escape-from-hadoop The strategy Russell uses there is the Spark MySQL driver, so there is no need to deal with Sqoop. You do have to add the dependency to your Spark classpath for it to work. No need...
Try this:
1. Create a directory in HDFS: hdfs dfs -mkdir /usr/lib/sqoop
2. Copy the sqoop jar into HDFS: hdfs dfs -put /usr/lib/sqoop/sqoop-1.4.6.jar /usr/lib/sqoop/
3. Check whether the file exists in HDFS: hdfs dfs -ls /usr/lib/sqoop
4. Import using sqoop: sqoop import --connect jdbc:mysql://localhost/<dbname> --username root --password <password> --table <tablename> -m...
If you use --incremental lastmodified mode, then your --check-column is a timestamp that does not need to be numeric or unique. See: Sqoop incremental imports....
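A sketch of such an import (the connection details, table, and column names are placeholders):
  sqoop import --connect jdbc:mysql://localhost/mydb --username myuser --password mypass \
    --table orders --target-dir /user/hadoop/orders \
    --incremental lastmodified --check-column updated_at --last-value '2015-01-01 00:00:00'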
hadoop,sandbox,sqoop,hortonworks-data-platform
It's in /usr/hdp/2.2.0.0-2041/sqoop/lib
Sqoop does not support creating Hive external tables. Instead you might: use the Sqoop codegen command to generate the SQL for creating the Hive internal table that matches your remote RDBMS table (see http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_codegen_literal); modify the generated SQL to create a Hive external table; and execute the modified SQL in Hive...
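For the last two steps, a sketch could be: take the generated CREATE TABLE statement, change it to CREATE EXTERNAL TABLE, add a LOCATION clause pointing at the imported data, save it to a file (the file name below is an assumption), and run it through the Hive CLI:
  hive -f create_mytable_external.sql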
sqoop-merge doesn't support HBase, but running a new import (even from another SQL table) will overwrite the data in HBase. You can provide a custom where clause plus custom columns to update just the data you need without affecting the rest of the data already stored in HBase: sqoop import --connect...
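A sketch of such a targeted re-import (all names, the where condition, and the row key are placeholders):
  sqoop import --connect jdbc:mysql://localhost/mydb --username myuser --password mypass \
    --table mytable --columns id,name --where "updated_at > '2015-01-01'" \
    --hbase-table mytable --column-family cf --hbase-row-key id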
If it is a version-incompatibility issue, then you can give mysql-connector-java-5.1.31.jar a try, as I am using mysql-connector-java-5.1.31.jar with Sqoop version 1.4.5. For me, it works for both data import and export use cases.
Flume and Sqoop are designed to work with different kinds of data sources. Sqoop works with any RDBMS that supports JDBC connectivity. Flume, on the other hand, works well with streaming data sources like log data, which is generated continuously in your environment. Specifically, Sqoop...
mysql,oracle,postgresql,hive,sqoop
In the newer versions of Sqoop you have the --hive-drop-import-delims or --hive-delims-replacement option. See https://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html. That will deal with the \r, \n and \001 characters in your string fields. For other replacements, your workaround with the REPLACE function is the way to go...
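For instance, a sketch of an import using the first option (the connection details and names are placeholders):
  sqoop import --connect jdbc:mysql://localhost/mydb --username myuser --password mypass \
    --table mytable --hive-import --hive-table mytable --hive-drop-import-delims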
hadoop,mapreduce,apache-pig,sqoop,flume
Question 1: Answer: C. Explanation: You need to join the user profile records and the weblogs. The weblogs are already ingested into HDFS, so in order to join them with the user profiles, we need to bring the user profiles into HDFS as well. The user profiles reside in an OLTP database, so to import them into HDFS we need...
hadoop,parameters,connection-string,sqoop
If you want to give your database connection string and credentials separately, then create a file with those details and use --options-file in your sqoop command. Create a file database.props with the following details: import --connect jdbc:mysql://localhost:5432/test_db --username root --password password Then your sqoop import command will look like: sqoop --options-file...
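As a sketch of the final step (the options-file path, table, and target directory are placeholders):
  sqoop --options-file /home/hadoop/database.props --table test_table --target-dir /user/hadoop/test_table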
This type of error usually occurs when there is a version conflict, so please make sure your Sqoop version is compatible with your Hadoop distribution. And in case you are using a third-party connector for importing data, it should also be compatible with your Sqoop version. I am using...
According to my internet search, it is not possible to perform both insert and update directly to a PostgreSQL DB. Instead you can create a stored procedure/function in PostgreSQL and send the data there: sqoop export --connect <url> --call <upsert proc> --export-dir /results/bar_data The stored procedure/function should perform both the update and the insert....
Import a MySQL table into Hive:
sqoop import --connect jdbc:mysql://localhost:3306/mysqldatabase --table mysqltablename --username mysqlusername --password mysqlpassword --hive-import --hive-table hivedatabase.hivetablename --warehouse-dir /user/hive/warehouse
Changes to be made:
mysqldatabase -- your MySQL database name from which the table is to be imported to Hive.
mysqltablename -- your MySQL table name to be imported...
hadoop,mapreduce,sqoop,oozie,hortonworks-data-platform
So here is the way I solved it. We are using CDH5 to run Camus to pull data from Kafka. We run CamusJob, which is responsible for getting data from Kafka, using the command line: hadoop jar... The problem is that the new hosts didn't get the so-called "yarn-gateway". Cloudera names the pack of...
hadoop,hbase,sqoop,apache-spark
"I think HBase is not a core component of Hadoop, hence as a client, what should I do?" HBase is indeed not a core component of Hadoop. To use it, you need to install HBase on top of your Hadoop cluster. It depends on HDFS/ZooKeeper. It is not dependent on...
Confirm that you have copied your workflow.xml to HDFS. You need not copy job.properties to HDFS, but you do have to copy all the other files and libraries there.
You need to add all the lib files, such as JDBC drivers, to the Oozie share lib folder, inside its sqoop folder; this should resolve your issue. To check the library files invoked/used by the job, go to the job tracker for the corresponding job, and in the syslogs you will...
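For example, placing a MySQL driver into the share lib might look like this (the share lib path varies by Oozie version and distribution, so the path, jar version, and the -sharelibupdate step, which only exists on newer Oozie releases, are assumptions):
  hdfs dfs -put mysql-connector-java-5.1.31.jar /user/oozie/share/lib/sqoop/
  oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate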
mysql,serialization,import,hbase,sqoop
You can represent 0x1E (up arrow) with CHAR(30) and 0x1F (down arrow) with CHAR(31); therefore, you can provide a free-form query and perform the replacements there. This should achieve exactly what you're looking for:
sqoop import --connect jdbc:mysql://localhost:3306/[db] \
  --username [user] --password [pwd] \
  --query 'SELECT CONCAT(email_address,updated_date) as id, REPLACE(REPLACE(modification,":",CHAR(31),uri),"|",CHAR(30),uri) as...
As the error states: Could not load db driver class: dbDriver. There are likely two problems: the JDBC URL is probably incorrect, and the JDBC jar needs to be included in the workflow. For the JDBC URL, make sure it looks like this: jdbc:vertica://VerticaHost:portNumber/databaseName For the JDBC jar, it needs to...
Looks like you need to install and configure a Teradata connector for Sqoop. See here: http://www.cloudera.com/content/support/en/downloads/download-components/download-products/downloads-listing/connectors/teradata.html...
The mysqldump utility is used with the "direct connector". The reason it cannot be found is that the mysqldump binary is either not on your system or not part of the PATH environment variable when Sqoop is running the MapReduce job. Things that will help: It seems like you're running...
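Two quick checks along these lines may help (package names differ by distribution and are assumptions here):
  which mysqldump            # verify the binary is on the PATH of every node that runs map tasks
  sudo yum install -y mysql  # RHEL/CentOS; on Debian/Ubuntu try: sudo apt-get install -y mysql-client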