
Ambari dashboard retrieving no statistics

hadoop,hortonworks-data-platform,ganglia

It turned out to be a proxy issue; to access the internet I had to add my proxy details to the file /var/lib/ambari-server/ambari-env.sh:

    export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -Dhttp.proxyHost=theproxy -Dhttp.proxyPort=80 -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'

When Ganglia was trying to access each node in the cluster, the request was going via the proxy...

Hadoop disaster recovery and preventing data loss

hadoop,hdfs,bigdata,hortonworks-data-platform,disaster-recovery

First, if you want a DR solution you need to store this data somewhere outside of the main production site. This implies that the secondary site should have at least the same storage capacity as the main one. Now remember that the main ideas that led to HDFS were moving...
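
A common way to ship HDFS data to a secondary site is DistCp. A minimal sketch, assuming the production and DR NameNodes are reachable as nn-prod and nn-dr (hypothetical hostnames):

    # Copy (and on re-runs, update) /data from production to the DR cluster
    hadoop distcp -update hdfs://nn-prod:8020/data hdfs://nn-dr:8020/backup/data

Run on a schedule, this gives a periodic copy to the secondary site.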

Java - MySQL to Hive Import where MySQL Running on Windows and Hive Running on CentOS (Horton Sandbox)

java,mysql,hive,sqoop,hortonworks-data-platform

Yes, you can do it via ssh; the Horton Sandbox comes with ssh support pre-installed. You can execute the sqoop command via an ssh client on Windows. Or, if you want to do it programmatically (that's what I have done in Java), you have to follow these steps. Download the sshxcute java...
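
For the programmatic route, a rough sketch using the sshxcute library (host, credentials, and the sqoop command are hypothetical placeholders; check the exact API against the version you download):

    import net.neoremind.sshxcute.core.ConnBean;
    import net.neoremind.sshxcute.core.SSHExec;
    import net.neoremind.sshxcute.task.impl.ExecCommand;

    public class SqoopOverSsh {
        public static void main(String[] args) throws Exception {
            // Connect to the Horton Sandbox VM (placeholder host/user/password)
            ConnBean conn = new ConnBean("192.168.56.101", "root", "hadoop");
            SSHExec ssh = SSHExec.getInstance(conn);
            ssh.connect();
            // Run the sqoop import on the sandbox, pulling from MySQL on Windows
            ssh.exec(new ExecCommand(
                    "sqoop import --connect jdbc:mysql://windows-host:3306/mydb"
                    + " --username dbuser --password dbpass --table mytable --hive-import"));
            ssh.disconnect();
        }
    }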

Sqoop incremental import (db schema incorrect)

hortonworks-data-platform,sqoop2

Try this:

    sqoop import \
      --connect "jdbc:sqlserver://192.168.40.133:1434;database=AdventureWorksLT2012;username=test;password=test" \
      --table ProductModel \
      --hive-import \
      -- --schema SalesLT \
      --incremental append \
      --check-column ProductModelID \
      --last-value "128"

...

Giraph ZooKeeper port problems

hadoop,zookeeper,hortonworks-data-platform,giraph

Yes, you can specify it each time you run a Giraph job by using the option -Dgiraph.zkList=localhost:2181. You can also set it up in the Hadoop configs, and then you don't have to pass this option each time you submit a Giraph job. For that, add the following line in the conf/core-site.xml file...
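
The elided line is presumably the same property in XML form; a sketch based on the option shown above (localhost:2181 is just the example value from this answer):

    <property>
      <name>giraph.zkList</name>
      <value>localhost:2181</value>
    </property>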

hCatalog page gives error - HortonWorks Sandbox

sandbox,hortonworks-data-platform,hcatalog

I suspect this is due to memory. Your memory should be at least 4096 MB.

Ambari 1.7.0 cannot register datanodes in CentOS cluster

linux,hadoop,hortonworks-data-platform

What do you have specified as the hostname in /etc/ambari-agent/conf/ambari-agent.ini? I assume that it is 'namenode.localdomain.com'.
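
For reference, the hostname the agent registers with lives in the [server] section of that file; a minimal sketch using the value assumed above:

    [server]
    hostname=namenode.localdomain.com
    url_port=8440
    secured_url_port=8441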

Why does YARN take a lot of memory for a simple count operation?

hadoop,mapreduce,hive,yarn,hortonworks-data-platform

A simple count operation involves a MapReduce job at the back end, and that involves 10 million rows in your case. Look here for a better explanation. That said, this only covers what happens in the background and the execution time, not your question regarding memory requirements. At least,...

DataNodes can't talk to NameNode

hadoop,bigdata,hortonworks-data-platform,ambari,hortonworks

Moving the NameNode to the same network as the DataNodes solved the problem. The DataNodes are in the 192.1.5.* network; the NameNode was in the 192.1.4.* network. Moving the NameNode to 192.1.5.* did the trick in my case....

Hadoop on Google Compute Engine: how to add external software

google-compute-engine,hortonworks-data-platform,google-hadoop

bdutil is in fact designed to support custom extensions; you can certainly edit an existing one for an easy way to get started, but the recommended best practice is to create your own "_env.sh" extension which can be mixed in with other bdutil extensions if necessary. This way you can more...
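
From memory, a bdutil "_env.sh" extension is just a shell fragment that registers extra install steps; a rough sketch under that assumption (file name, script path, and step names are hypothetical):

    # my_software_env.sh -- hypothetical bdutil extension
    # Register a command group backed by a script uploaded to each VM
    COMMAND_GROUPS+=(
      "install_my_software:
         libexec/install_my_software.sh"
    )
    # Run it on the master and on the workers
    COMMAND_STEPS+=(
      'install_my_software,install_my_software'
    )

It would then be mixed in at deploy time with something like ./bdutil -e my_software_env.sh deploy.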

Datanode process is not getting started with Hortonworks sandbox manual set up

java,hadoop,hortonworks-data-platform,hortonworks

First, delete all contents from the HDFS folder (the value of hadoop.tmp.dir):

    rm -rf /grid/hadoop/hdfs

Make sure that the dir has the right owner and permissions (username according to your system):

    sudo chown hduser:hadoop -R /grid/hadoop/hdfs
    sudo chmod 777 -R /grid/hadoop/hdfs

Format the namenode:

    hadoop namenode -format

Try this:

    sudo chown -R hdfs:hadoop /grid/hadoop/hdfs/dn

...

Ambari 2.0 installation, “” failure

hadoop,bigdata,hortonworks-data-platform,ambari,hortonworks

Use a fully qualified hostname (FQHN) on both the Ambari server and all its client nodes.
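
For example (the domain is hypothetical), hostname -f on every node should return a fully qualified name, and /etc/hosts on each node should map the addresses consistently:

    192.168.1.10   ambari.cluster.example.com   ambari
    192.168.1.11   node1.cluster.example.com    node1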

Flume-ng hdfs sink .tmp file refresh rate control property

cloudera,flume,hortonworks-data-platform,flume-ng,flume-twitter

Consider decreasing your channel's capacity and transactionCapacity settings:

    capacity             100   The maximum number of events stored in the channel
    transactionCapacity  100   The maximum number of events the channel will take from a source or give to a sink per transaction

These settings are responsible for controlling how many events get...
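
In an agent configuration file that translates to lines like these (the agent and channel names are hypothetical):

    agent1.channels.ch1.type = memory
    agent1.channels.ch1.capacity = 100
    agent1.channels.ch1.transactionCapacity = 100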

Where is the hadoop examples directory located in the Hortonworks Data Platform sandbox?

hadoop,hadoop2,hortonworks-data-platform

I am not certain whether the examples are still shipped with HDP. You would want to search for hadoop-examples*.jar. If you are unable to locate the examples jar, you might need to download it. Unfortunately, it appears that version 2 of Hadoop does not have a maven artifact for it?? http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-examples ...
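
To hunt for the jar on the sandbox, something like this should work (in Hadoop 2 the examples artifact was renamed hadoop-mapreduce-examples, so a loose wildcard helps):

    find / -name 'hadoop*examples*.jar' 2>/dev/null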

Updating IP addresses in Apache Ambari

hortonworks-data-platform,ambari

This great blog post helped us: http://www.swiss-scalability.com/2015/01/rename-host-in-ambari-170.html Basically, you will need to log into Ambari's database (not the GUI, the actual backend database). It's best to read the blog post in its entirety, but I am appending the important secret sauce that actually makes things happen. If you're on mysql:...
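
The actual statements are in the linked post; purely as an illustration of the idea (table and column names from memory, hostnames hypothetical), the rename boils down to updates like:

    UPDATE hosts
    SET host_name = 'new-host.example.com',
        public_host_name = 'new-host.example.com'
    WHERE host_name = 'old-host.example.com';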

HDP 2.2 manual installation namenode format - wrong number of parameters?

hadoop,hadoop2,hortonworks-data-platform

For formatting the NameNode, you can use the following command, run as the 'hdfs' admin user:

    /usr/bin/hdfs namenode -format

For starting up the NameNode daemon, use the hadoop-daemon.sh script:

    /usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode

"-config $HADOOP_CONF_DIR" is an optional parameter here in case you want to reference a specific Hadoop configuration directory....

Getting Unexpected Error while accessing data from HortonBox Hadoop Hive server to Tableau?

tableau,hortonworks-data-platform,tableau-server

I got an answer. First, we will need to go to Hive and enter this query:

    grant SELECT on table "table-name" to user hue;

where "table-name" is the table you want user "hue" to view....

HDP 2.2 Sandbox Could not find SQOOP directory

hadoop,sandbox,sqoop,hortonworks-data-platform

It's in /usr/hdp/2.2.0.0-2041/sqoop/lib

Sqoop Job via Oozie HDP 2.1 not creating job.splitmetainfo

hadoop,mapreduce,sqoop,oozie,hortonworks-data-platform

So here is the way I solved it. We are using CDH5 to run Camus to pull data from Kafka. We run CamusJob, which is responsible for getting data from Kafka, using the command line: hadoop jar... The problem is that new hosts didn't get the so-called "yarn-gateway". Cloudera names pack of...

Hue Beeswax / HCat no longer working (kerberos default user) after migration to HDP2.2

hive,kerberos,hortonworks-data-platform,hue

Okay, found it (I had to debug the full Python stack to understand). It's not really advertised, but some hue.ini parameter names have changed:

    beeswax_server_host --> hive_server_host
    beeswax_server_port --> hive_server_port

It was defaulting hive_server_host to localhost, which is not correct on a secure cluster....
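
In hue.ini those settings live in the [beeswax] section; a sketch with a hypothetical HiveServer2 host:

    [beeswax]
      hive_server_host=hiveserver2.cluster.example.com
      hive_server_port=10000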

Oozie or Shell Script for Workflow orchestration in Hadoop

hadoop,oozie,cloudera-cdh,hortonworks-data-platform,oozie-coordinator

Generally speaking, Oozie has several advantages here:

  • It generates a DAG each time, so you have a direct view of your workflow.
  • Easier access to the log files for each action, such as hive, pig, etc.
  • You will have your full history for each run of your task.
  • Better schedule...

Shell command to transfer files from HDFS to local filesystem in Hadoop 2.6.9

hadoop,hadoop2,hortonworks-data-platform

To copy from HDFS to the local filesystem:

    hdfs dfs -get /hdfs/path /local/path

To copy from the local filesystem to HDFS:

    hdfs dfs -put /local/path /hdfs/path

...

Ambari is not able to start the Namenode

hadoop,hortonworks-data-platform,ambari

Partially fixed: it is necessary to stop all the HDFS services (JournalNode, NameNodes and DataNodes) before editing the hdfs-site.xml file. Then, of course, Ambari's "start button" cannot be used because the configuration would be smashed... thus it is necessary to re-start all the services manually. This is not the...

HDP host registration failure

hortonworks-data-platform

After spending plenty of time on it, I concluded that despite having Internet connectivity, local repositories would be needed. I installed the Apache server and made my repositories accessible as per the documentation. Then, in 'Advanced Repository Options', I replaced the web URL with the local repository URL and it registered the...
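
As an illustration of that setup (paths and the repo host are hypothetical; the real baseurl values come from the HDP documentation):

    # Serve the downloaded repository tree over HTTP
    sudo yum install httpd
    sudo cp -r HDP/ /var/www/html/hdp
    sudo service httpd start
    # The URL pasted into 'Advanced Repository Options' then looks like:
    #   http://repo-server.example.com/hdp/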

oozie 4.1.0 launcher fails with OozieLauncherInputFormat$EmptySplit not found

hadoop,oozie,hortonworks-data-platform

The Hortonworks Hadoop companion files contain an oozie-site.xml property, oozie.services, that is missing the entry which enables ShareLibService. As a result the new shared lib feature doesn't work, because the endpoint is not registered. To fix this, add an org.apache.oozie.service.ShareLibService entry to the oozie.services list. Be careful, as the services are not independent, so the order matters!...
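
A sketch of the fix in oozie-site.xml (the list is abbreviated here; keep every entry from the companion files and preserve their order):

    <property>
      <name>oozie.services</name>
      <value>
        ...,
        org.apache.oozie.service.ShareLibService,
        ...
      </value>
    </property>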

Loading csv file into HDFS using Flume (spool directory as source)

hadoop,hadoop-streaming,flume,hortonworks-data-platform,flume-ng

I was using the Hortonworks Sandbox v2.2. After a long time of debugging, I found out there were some conflicts between the Spark version I installed manually (v1.2) and the Hortonworks Sandbox libraries, so I decided to use the Cloudera QuickStart 5.3.0 and now everything works fine.

apache falcon's role in the hadoop ecosystem

apache,hadoop,hdfs,bigdata,hortonworks-data-platform

Apache Falcon simplifies the configuration of data motion with: replication; lifecycle management; lineage and traceability. This provides data governance consistency across Hadoop components. Falcon replication is asynchronous with delta changes. Recovery is done by running a process and swapping the source and target. Data loss – Delta data may...

handling oracle's ROWID in apache hive

oracle,hadoop,hive,hiveql,hortonworks-data-platform

Hive doesn't have a unique identifier for each row (like Oracle's ROWID). But if you don't have any primary key or unique key values, you can use the analytic function row_number.
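
A sketch in HiveQL (the table and ordering column are hypothetical):

    SELECT ROW_NUMBER() OVER (ORDER BY id) AS rowid, t.*
    FROM my_table t;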

HDP Cluster | SSH password less connection failing

hadoop,admin,hortonworks-data-platform

The issue is fixed. There was a permission issue with the user folder, i.e. /home/hduser. Somehow the permissions got changed.
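
sshd refuses key authentication when these permissions are too open; a typical repair, assuming the hduser layout above, looks like:

    chmod 755 /home/hduser
    chmod 700 /home/hduser/.ssh
    chmod 600 /home/hduser/.ssh/authorized_keys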

Why is hive Metatool updatelocation called upon Ambari migration from 1.6.0 to 2.0.0 to move locations to unwanted places?

hadoop,hive,hortonworks-data-platform,ambari

See if this JIRA helps with the issue you are hitting: https://issues.apache.org/jira/browse/AMBARI-10360 ...

Host and port to use to list a directory in hdfs

java,hadoop,hdfs,hortonworks-data-platform

Check the value of the property fs.defaultFS in core-site.xml; this contains the IP address/hostname and port on which the NameNode daemon binds when it starts up. I see that you are using the Hortonworks sandbox; here is the property in core-site.xml, which is located at /etc/hadoop/conf/core-site.xml:

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://sandbox.hortonworks.com:8020</value>
    </property>

So, you...

HDP 2.0 Oozie Error: E0803 : E0803: IO error, E0603

hadoop,oozie,hortonworks-data-platform

The wrong user was used during the installation process. This solved the problem:

    sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -sqlfile /usr/lib/oozie/oozie.sql -run

Instead of:

    sudo /usr/lib/oozie/bin/ooziedb.sh create -sqlfile /usr/lib/oozie/oozie.sql -run

...

Import TSV file into hbase table

hadoop,hive,hbase,hortonworks-data-platform

Do you have a table already created in HBase? You will first have to create a table in HBase with 'd' as a column family, and then you can import this tsv file into that table.
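
A sketch of the two steps (table name, columns, and HDFS path are hypothetical; only the 'd' column family comes from the question):

    # In the hbase shell: create the table with column family 'd'
    create 'mytable', 'd'

    # Then load the TSV with the stock ImportTsv tool
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.columns=HBASE_ROW_KEY,d:col1,d:col2 \
      mytable /user/me/data.tsv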

Hive: Resources added using script getting cleared in Hortonworks?

hadoop,hive,yarn,hortonworks-data-platform

I have been using Hortonworks, and you need to add the file/jar within the same session, as you have discovered.
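
That is, re-issue the ADD commands at the start of every Hive session (paths are hypothetical):

    ADD JAR /tmp/my-udfs.jar;
    ADD FILE /tmp/lookup.txt;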

How to install mahout using ambari server

hadoop,bigdata,mahout,hortonworks-data-platform

Hello fellow Hortonworker! Mahout is in the HDP repositories, but it's not available in the Ambari install wizard (i.e. Services -> Add Service). Therefore the only way to install it is via:

    yum install mahout

As noted here, you should only install it on the master node. Also note that Mahout is...

Apache Falcon vs Wandisco Non-stop hadoop

apache,hadoop,hdfs,hortonworks-data-platform,database-mirroring

(Disclaimer: I work at WANdisco.) My view is that the products are complementary. Falcon does a lot of things besides data transfer, like setting up data workflow stages. WANdisco's products do active-active data replication (which means that data can be used equivalently from both the source and target clusters). In...