Posts tagged with: curl

accessing Hive tables with curl and webHCat

For quick and easy access, you can consider using WebHCat, a REST interface for accessing HCatalog, and thus Hive.

Let’s assume we’re on a kerberized cluster (you shouldn’t be running an unkerberized cluster anyway, remember…)

First, we check which port is used (the default is 50111) in Hive’s webhcat-site.xml (or in the Hive configuration within the Ambari interface).

templeton (webHCat) port

templeton is the former name for WebHCat.
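
If you’d rather check from the command line, a quick grep does the trick (a small sketch: /etc/hive-webhcat/conf is where HDP usually puts webhcat-site.xml, so adjust the path to your installation; the property to look for is templeton.port):

[root@sandbox ~]# grep -A1 'templeton.port' /etc/hive-webhcat/conf/webhcat-site.xml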

Let’s try a curl call against WebHCat to see the DDL of the default database:

[root@sandbox ~]# curl -i --negotiate -u: "http://sandbox.hortonworks.com:50111/templeton/v1/ddl/database/default"
HTTP/1.1 401 Authentication required
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 1328
Server: Jetty(7.6.0.v20120127)

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 401 Authentication required</title>
</head>
<body>
<h2>HTTP ERROR: 401</h2>
<p>Problem accessing /templeton/v1/ddl/database/default. Reason:
<pre> Authentication required</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>

Hmmm, obviously: we have to kinit before being able to access HCatalog.

[root@sandbox ~]# kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs
[root@sandbox ~]# curl -i --negotiate -u: "http://sandbox.hortonworks.com:50111/templeton/v1/ddl/database/default"
HTTP/1.1 401 Authentication required
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 1328
Server: Jetty(7.6.0.v20120127)

HTTP/1.1 500 Server Error
Set-Cookie: hadoop.auth="u=hdfs&p=hdfs@HORTONWORKS.COM&t=kerberos&e=1475885113041&s=p+38gIJagH2o1pTkoGK+af3a6Ks="; Path=/; Expires=Sat, 08-Oct-2016 00:05:13 GMT; HttpOnly
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(7.6.0.v20120127)

{"error":"User: HTTP/sandbox.hortonworks.com@HORTONWORKS.COM is not allowed to impersonate hdfs"}

 

This is a fairly common message: since you’re calling a REST API, your request carries a so-called SPNego token, which you can think of as the “Kerberos for HTTP”.
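
If you’re curious what that looks like on the wire, re-running the call with -v instead of -i shows the handshake: the server answers 401 with a WWW-Authenticate: Negotiate header, and curl retries with an Authorization: Negotiate header carrying the base64-encoded token.

[root@sandbox ~]# curl -v --negotiate -u: "http://sandbox.hortonworks.com:50111/templeton/v1/ddl/database/default"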

You must therefore be able to authenticate with a SPNego token, but the HTTP service principal must also be allowed to impersonate you (meaning HTTP performs the request on behalf of your username).

These proxyuser parameters can be found in HDFS’s core-site.xml:

HTTP proxyuser configuration
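
If you don’t have the Ambari UI at hand, you can also grep them straight out of the file (again a small sketch, assuming the usual /etc/hadoop/conf location); the two properties of interest are hadoop.proxyuser.HTTP.groups and hadoop.proxyuser.HTTP.hosts:

[root@sandbox ~]# grep -A1 'hadoop.proxyuser.HTTP' /etc/hadoop/conf/core-site.xml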

So here, we can see that HTTP can only impersonate users belonging to the group users.

[root@sandbox ~]# id hdfs
uid=505(hdfs) gid=501(hadoop) groups=501(hadoop),503(hdfs)
[root@sandbox ~]# id ambari-qa
uid=1001(ambari-qa) gid=501(hadoop) groups=501(hadoop),100(users)

That’s right, hdfs doesn’t belong to that group. However, ambari-qa does! Let’s kinit as ambari-qa.

[root@sandbox ~]# kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa
[root@sandbox ~]# curl -i --negotiate -u: "http://sandbox.hortonworks.com:50111/templeton/v1/ddl/database/default"
HTTP/1.1 401 Authentication required
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 1328
Server: Jetty(7.6.0.v20120127)

HTTP/1.1 200 OK
Set-Cookie: hadoop.auth="u=ambari-qa&p=ambari-qa@HORTONWORKS.COM&t=kerberos&e=1475885666292&s=/WGJZIe4BRKBoI4UmxfHUv8r7MU="; Path=/; Expires=Sat, 08-Oct-2016 00:14:26 GMT; HttpOnly
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(7.6.0.v20120127)

{"location":"hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse","ownerType":"ROLE","owner":"public","comment":"Default Hive database","database":"default"}

That’s it, you got your DDL!
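
From there, the rest of the WebHCat DDL API works the same way; for instance, listing the tables of the default database is just another GET on the same base URL (reusing the sandbox host and port from above):

[root@sandbox ~]# curl --negotiate -u: "http://sandbox.hortonworks.com:50111/templeton/v1/ddl/database/default/table"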

 


Ambari tips & tricks

Restarting some components

(including clients, which cannot be put in any state other than “INSTALLED”):

curl -uadmin:admin -H 'X-Requested-By: ambari' -X POST -d '
{
  "RequestInfo": {
    "command": "RESTART",
    "context": "Restart ZooKeeper Client and HDFS Client",
    "operation_level": {
      "level": "HOST",
      "cluster_name": "hdp-cluster"
    }
  },
  "Requests/resource_filters": [
    {
      "service_name": "ZOOKEEPER",
      "component_name": "ZOOKEEPER_CLIENT",
      "hosts": "gw.example.com"
    },
    {
      "service_name": "ZOOKEEPER",
      "component_name": "ZOOKEEPER_SERVER",
      "hosts": "gw.example.com,nn.example.com,dn1.example.com"
    }
  ]
}' http://gw.example.com:8080/api/v1/clusters/hdp-cluster/requests

As indicated in the wiki, the RESTART command refreshes the configs.
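
The POST returns the id (and href) of the request it created; if you want to follow the restart from the command line as well, you can poll that request (a quick sketch, the id 42 below only stands for whatever id the call above returned):

// check the progress of the restart request
curl -u admin:admin -X GET http://gw.example.com:8080/api/v1/clusters/hdp-cluster/requests/42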

 

Delete a host from Ambari

// get all COMPONENTS for the host

[root@uabdfes03 ~]# curl -u admin:admin -H "X-Requested-By:ambari" -i -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts/$HOSTNAME/host_components

// delete all COMPONENTS for this HOST
[root@host ~]# for COMPONENT in ZOOKEEPER_CLIENT YARN_CLIENT PIG OOZIE_CLIENT NODEMANAGER MAPREDUCE2_CLIENT HIVE_CLIENT HDFS_CLIENT HCAT HBASE_REGIONSERVER HBASE_CLIENT GANGLIA_MONITOR DATANODE; do curl -u admin:admin -H "X-Requested-By:ambari" -i -X DELETE http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts/$HOSTNAME/host_components/$COMPONENT; done
// delete HOST
[root@host ~]# curl -u admin:admin -H "X-Requested-By:ambari" -i -X DELETE http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts/$HOSTNAME
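
If you’d rather not hard-code the component list, you can also build it from the GET call above (a rough sketch, assuming python is available on the host to parse the JSON; it relies on the standard items / HostRoles / component_name layout of the Ambari response):

// same as above, but deriving the component list from the API
[root@host ~]# for COMPONENT in $(curl -s -u admin:admin -H "X-Requested-By:ambari" "http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts/$HOSTNAME/host_components" | python -c 'import json,sys; print("\n".join(i["HostRoles"]["component_name"] for i in json.load(sys.stdin)["items"]))'); do curl -u admin:admin -H "X-Requested-By:ambari" -i -X DELETE http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts/$HOSTNAME/host_components/$COMPONENT; done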

Delete a service

(for example STORM)

// get the components for that service
[vagrant@gw ~]$ curl -u admin:admin -X GET  http://gw.example.com:8080/api/v1/clusters/hdp-cluster/services/STORM
// stop the service
[vagrant@gw ~]$ curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo":{"context":"Stop Service"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' http://gw.example.com:8080/api/v1/clusters/hdp-cluster/services/STORM
// stop each component on the host (gw.example.com in this example)
[vagrant@gw ~]$ for COMPONENT_NAME in DRPC_SERVER NIMBUS STORM_REST_API STORM_UI_SERVER SUPERVISOR; do curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo":{"context":"Stop Component"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://gw.example.com:8080/api/v1/clusters/hdp-cluster/hosts/gw.example.com/host_components/${COMPONENT_NAME}; done
// stop service components
[vagrant@gw ~]$ for COMPONENT_NAME in DRPC_SERVER NIMBUS STORM_REST_API STORM_UI_SERVER SUPERVISOR; do curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo":{"context":"Stop All Components"},"Body":{"ServiceComponentInfo":{"state":"INSTALLED"}}}' http://gw.example.com:8080/api/v1/clusters/hdp-cluster/services/STORM/components/${COMPONENT_NAME}; done
// delete the service
[vagrant@gw ~]$ curl -u admin:admin -H 'X-Requested-By: ambari' -X DELETE http://gw.example.com:8080/api/v1/clusters/hdp-cluster/services/STORM
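
A note on that last step: Ambari will usually refuse the DELETE while components are still running, so if it fails, check that the service really reached the INSTALLED state (same endpoint as the first GET, narrowed with a fields filter):

// check the service state (should be INSTALLED before the DELETE)
[vagrant@gw ~]$ curl -u admin:admin -X GET "http://gw.example.com:8080/api/v1/clusters/hdp-cluster/services/STORM?fields=ServiceInfo/state"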

Add a component

For example, let’s add an HBase RegionServer.

// add the component
[vagrant@gw ~]$ curl -u admin:admin -H "X-Requested-By:ambari" -i -X POST http://gw.example.com:8080/api/v1/clusters/hdp-cluster/hosts/gw.example.com/host_components/HBASE_REGIONSERVER

// then install
[vagrant@gw ~]$ curl -u admin:admin -H "X-Requested-By:ambari" -i -X PUT -d '{"RequestInfo": {"context": "Install RegionServer","query":"HostRoles/component_name.in('HBASE_REGIONSERVER')"}, "Body":{"HostRoles": {"state": "INSTALLED"}}}' http://gw.example.com:8080/api/v1/clusters/hdp-cluster/hosts/gw.example.com/host_components
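
Once the install request has completed, starting the new RegionServer follows the same pattern, targeting the STARTED state on the host component (a sketch mirroring the calls above):

// then start it
[vagrant@gw ~]$ curl -u admin:admin -H "X-Requested-By:ambari" -i -X PUT -d '{"RequestInfo": {"context": "Start RegionServer"}, "Body":{"HostRoles": {"state": "STARTED"}}}' http://gw.example.com:8080/api/v1/clusters/hdp-cluster/hosts/gw.example.com/host_components/HBASE_REGIONSERVER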

Get host components for a service

[vagrant@gw ~]$ curl -u admin:admin -H "X-Requested-By:ambari" -i -X GET "http://gw.example.com:8080/api/v1/clusters/hdp-cluster/hosts?host_components/HostRoles/service_name=HDFS&fields=host_components/HostRoles/service_name"

(the URL is quoted here because it contains a &, which the shell would otherwise interpret)