Philip Hutchins

Head in the cloud...

Find IP of Network Share Host

Finding the IP address of a device on your network is generally an easy thing to do. This isn’t always the case when the device is a network device or share using mDNS, also known as Bonjour on OSX.

If you know the network name of the device, you can use dscacheutil to query for the device and get its IP address.

dscacheutil -q host -a name [network name].local

For example, I have a NAS on my network, but the NAS requires an older display connector that I generally don’t keep around. Unfortunately, it lives on a network where we don’t control the IP space. When the IP of the device changes, it’s not so easy to find its address. This is what I used to find its IP.

$ dscacheutil -q host -a name storjnas.local
name: storjnas.local
ip_address: 10.150.50.50
ip_address: 10.150.50.50
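
If you only want the address itself, for example to pass it to another command, the output can be filtered. This is a minimal sketch that assumes the name: / ip_address: format shown above; extract_ips is a hypothetical helper name, not something dscacheutil provides.

```bash
# Hypothetical helper: print each unique address found in
# `dscacheutil -q host` style output read from stdin.
extract_ips() {
  awk '$1 == "ip_address:" { print $2 }' | sort -u
}

# Sample input copied from the output above.
printf 'name: storjnas.local\nip_address: 10.150.50.50\nip_address: 10.150.50.50\n' | extract_ips
```

On a Mac you would pipe the real command into it: dscacheutil -q host -a name storjnas.local | extract_ips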

Java Trust Anchors Error When Installing ES Plugins

The Error

root@elasticsearch-1:/opt/elasticsearch/elasticsearch# ./bin/elasticsearch-plugin install repository-gcs
-> Downloading repository-gcs from elastic
Exception in thread "main" javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1926)
          at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1921)
          at java.security.AccessController.doPrivileged(Native Method)
          at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1920)
          at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1490)
          at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
          at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
          at org.elasticsearch.plugins.InstallPluginCommand.downloadZip(InstallPluginCommand.java:279)
          at org.elasticsearch.plugins.InstallPluginCommand.downloadZipAndChecksum(InstallPluginCommand.java:322)
          at org.elasticsearch.plugins.InstallPluginCommand.download(InstallPluginCommand.java:231)
          at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:210)
          at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:195)
          at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54)
          at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
          at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:69)
          at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
          at org.elasticsearch.cli.Command.main(Command.java:88)
          at org.elasticsearch.plugins.PluginCli.main(PluginCli.java:47)

The Reason

The message means that the trust store you specified (or the one specified for you by default) could not be opened, either because of access permissions or because it doesn’t exist.

To fix this, you need the ca-certificates-java package, which is not explicitly installed by the Oracle JDK/JRE. Even when the package is installed, you may still have to run its configuration manually.

The Solution

The solution is to run the configuration for this package. Make sure to install the package if it hasn’t already been installed.

sudo /var/lib/dpkg/info/ca-certificates-java.postinst configure
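
Before and after running the configuration, it can help to check whether the trust store is actually there. The sketch below assumes the store lives at /etc/ssl/certs/java/cacerts, which is where ca-certificates-java normally puts it on Debian and Ubuntu; adjust the path if your JVM is configured differently. truststore_ok is a hypothetical helper name.

```bash
# Hypothetical helper: succeed only if the trust store file exists,
# is readable, and is not empty.
# The default path is an assumption (Debian/Ubuntu layout).
truststore_ok() {
  local store=${1:-/etc/ssl/certs/java/cacerts}
  [ -r "$store" ] && [ -s "$store" ]
}

if truststore_ok; then
  echo "trust store looks usable"
else
  echo "trust store is missing, unreadable or empty"
fi
```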

SSH Too Many Authentication Failures

SSH by default will always try all keys known by the agent in addition to any identity files. To keep this from happening, you can specify IdentitiesOnly=yes in addition to specifying a particular key to use when authenticating.

To compare…

This command: ssh -i ~/.ssh/id_rsa me@myserver hostname

will give you Received disconnect from myserver: 2: Too many authentication failures for me

However, if you add IdentitiesOnly=yes like so… ssh -o IdentitiesOnly=yes -i ~/.ssh/id_rsa me@myserver hostname

you will see…

myserver

SSH Agent

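Generally this is caused by having more than 5 keys loaded in ssh-agent. Each key the client offers counts as an authentication attempt, and the server disconnects once its MaxAuthTries limit (6 by default in OpenSSH) is reached.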

You can remove keys by running ssh-add -d ~/.ssh/[key]
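
If you don’t want to pass the options on every call, the same two settings can live in your SSH client config. This is only a sketch: add_identities_only_host is a hypothetical helper, and the host alias and key path are whatever you pass in. IdentityFile and IdentitiesOnly are the ssh_config equivalents of -i and -o IdentitiesOnly=yes.

```bash
# Hypothetical helper: append a Host block that pins one key and
# disables offering every key held by the agent.
# Arguments: host alias, key path, config file (defaults to ~/.ssh/config).
add_identities_only_host() {
  local host_alias=$1 key=$2 config=${3:-$HOME/.ssh/config}
  cat >> "$config" <<EOF

Host $host_alias
    IdentityFile $key
    IdentitiesOnly yes
EOF
}
```

For example, add_identities_only_host myserver ~/.ssh/id_rsa would let a plain ssh me@myserver behave like the second command above.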

Docker Tips and Tricks

Programmatically Getting Container Info

IPs & Ports

Get all containers IP addresses with container name

for docker

docker inspect -f '{{.Name}} - {{.NetworkSettings.IPAddress}}' $(docker ps -aq)

for docker compose

docker inspect -f '{{.Name}} - {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $(docker ps -aq)

Alias for getting the docker VM IP

Fixing Docker: No Space Left on Device

The Problem

If you’re running docker on OSX more than just a little, you’ve probably run into an issue where you’re building a container and it fails with an error that looks something like the following…

Error response from daemon: mkdir /var/lib/docker/tmp/docker-builder983959066: no space left on device

From this point on, you won’t be able to build new containers or images until you either 1) delete existing containers or images or 2) completely delete the virtual disk image that contains all of your docker containers and images.

The Breakdown

Ok, so let’s look at what this error is telling us. The context is important here.

On OSX, when building a docker container or image, the work is being done inside of a virtual machine. If you’ve looked at /var/lib/docker… on your local machine, you may have noticed that it’s either not there, or it is there but it isn’t full. The reason for this is that the /var/lib/docker folder that the error is referring to lives inside of the virtual machine in which docker for mac or docker-machine is doing its building.

The virtual file system lives on your disk here: /Users/philip/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2

By default this virtual disk is 20G.

Troubleshooting

To confirm that this is really your issue, here are the steps that I used. Your disk size and space used will be different as I ran these after resolving the issue. Either way, this will give you a good idea of how it all works.

Checking the Size of the Virtual Disk Image File

If you check the size of this file…

$ ls -altrh
total 47009328
-rw-r--r--   1 philip  staff    36B Nov 17 11:14 nic1.uuid
-rw-r--r--   1 philip  staff     0B Nov 17 11:14 lock
drwxr-xr-x  83 philip  staff   2.8K Jan  1 13:09 log/
lrwxr-xr-x   1 philip  staff    12B Jan  3 14:39 tty@ -> /dev/ttys004
-rw-r--r--   1 philip  staff     5B Jan  3 14:39 pid
-rw-r--r--   1 philip  staff    17B Jan  3 14:39 mac.0
-rw-r--r--   1 philip  staff     5B Jan  3 14:39 hypervisor.pid
drwxr-xr-x  21 philip  staff   714B Jan  3 14:39 ../
drwxr-xr-x  12 philip  staff   408B Jan  3 14:39 ./
-rw-r--r--   1 philip  staff   705B Jan  3 14:39 syslog
-rw-r--r--   1 philip  staff    64K Jan  3 20:12 console-ring
-rw-r--r--   1 philip  staff    22G Jan  4 07:11 Docker.qcow2

… you’ll notice that it’s 22G

Check Space Used From Within the VM

Then if you take a peek at the space used from within a container, it’s slightly different but fairly close… (pay attention to the root partition in this context)

docker run --rm --privileged debian:jessie df -h
Filesystem      Size  Used Avail Use% Mounted on
none             33G   20G   12G  63% /
tmpfs           999M     0  999M   0% /dev
tmpfs           999M     0  999M   0% /sys/fs/cgroup
/dev/vda1        33G   20G   12G  63% /etc/hosts
shm              64M     0   64M   0% /dev/shm

While I was having this issue, I was unable to run this command due to lack of space so you may not be able to do this until you fix the issue.
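
If you want to keep an eye on this before it becomes a problem, the root usage can be pulled out of that df output. This is a rough sketch: root_use_percent is a hypothetical helper, and the 90 percent warning threshold is an arbitrary assumption, not a Docker limit.

```bash
# Hypothetical helper: print the Use% value (without the % sign) of the
# filesystem mounted on / from `df -h` style output read from stdin.
root_use_percent() {
  awk '$NF == "/" { sub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# Header and root line copied from the output above.
usage=$(printf 'Filesystem      Size  Used Avail Use%% Mounted on\nnone             33G   20G   12G  63%% /\n' | root_use_percent)

# The threshold of 90 is an assumption; pick whatever suits you.
if [ "$usage" -ge 90 ]; then
  echo "Docker VM disk is nearly full (${usage}%)"
else
  echo "Docker VM disk usage is ${usage}%"
fi
```

Against a live install you would feed it the real command instead: docker run --rm --privileged debian:jessie df -h | root_use_percent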

The Fix

To fix this issue, we will need to do the following.

  • Stop Docker
  • Expand the disk image size
  • Launch a VM in which we run GParted against the Docker.qcow2 image
  • Expand the partition to use the additional space added to the disk image
  • Exit the VM and restart docker

Stop Docker

Let’s go ahead and stop docker so that the disk image is not being used while we resize. This may not be required, but better safe than sorry…

Install QEMU

To launch the virtual machine, you’ll need QEMU or something that can boot from an ISO and mount a qcow2 image. For this example, I’m using QEMU.

brew install qemu

Expand the disk image size

To expand the disk image, we’ll use the qemu-img util packaged with Docker for MacOS. If you can’t find this on your system, you should be able to get this from the qemu package.

$ /Applications/Docker.app/Contents/MacOS/qemu-img resize ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2 +5G

If you would like to expand it more or less, you can change the +5G on the end of the command as needed.

Download the GParted Live Image

Visit http://gparted.org/download.php and download the gparted-live ISO for your architecture. In this case, I downloaded gparted-live-0.27.0-1-amd64.iso.

Launch the VM Running GParted

Here we run qemu and launch a virtual machine adding our Docker.qcow2 disk image as a drive.

  • When prompted, you’ll select the options for booting GParted Live.
  • Select don’t touch keymap (unless you know what you’re doing)
  • The next step should default to 33 (US English) so change it if needed, otherwise, hit enter
  • For mode, select start X & GParted automatically which should be default
  • Click on the GParted icon

While launching, I saw a warning stating overlayfs: missing 'workdir'. You can safely ignore this. Just be patient and let it finish booting.

It may take a bit for the machine to completely come up so give it some time…

$ qemu-system-x86_64 -drive file=Docker.qcow2  -m 512 -cdrom ~/Downloads/gparted-live-0.27.0-1-amd64.iso -boot d -device usb-mouse -usb

Expand the Partition

In GParted…

  • select the partition (it should be the largest one and should match the size you’ve seen when inspecting the image)
  • right click the partition and select resize/move
  • resize it to use the full amount of space allocated for the disk by dragging the right side of the darkened box to the far right of the block
  • click resize
  • click apply
  • close GParted and exit the VM

Start Docker

Start docker back up however you normally would start it.

Confirm it Worked

At this point you can go back and run your commands to check space used from within the VM and confirm that the available size has increased as expected. If it did, you should be good to go!

Installing Insteon USB PLM With OpenHAB

Install Dependencies

Connect, Test & Configure PLM

Connect the PLM

  • Plug the PLM into a USB port on your server/computer

Find the PLM Device

Look in /dev for your serial USB device

philip@cube:/dev$ ls | grep USB
ttyUSB0

In my case it was ttyUSB0. If you unplug and re-connect the PLM it may show up as a different device, such as ttyUSB1 so make sure to check if things stop working after you reconnect it.
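
Because the device name can change, it can be handy to list the candidates with one command after each reconnect. A small sketch, assuming the PLM shows up with the ttyUSB prefix as it did here; list_usb_serial is a hypothetical helper name.

```bash
# Hypothetical helper: print the names of all ttyUSB* entries in a
# directory (defaults to /dev). Prints nothing if there are none.
list_usb_serial() {
  local dev_dir=${1:-/dev}
  local path
  for path in "$dev_dir"/ttyUSB*; do
    # When nothing matches, the pattern is left as-is and -e fails.
    if [ -e "$path" ]; then
      basename "$path"
    fi
  done
}

list_usb_serial
```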

Test the PLM

To test the PLM, we’ll be using Insteon Terminal which you will get from GitHub.

Let’s install the dependencies for the Insteon Terminal, which we will use to test and ensure that the connection between the PLM and your computer is working.

  • Install ant, default-jdk and librxtx-java

sudo apt-get install ant default-jdk librxtx-java

  • Add the user accessing the PLM device to the correct groups to allow permission. If the user accessing the PLM is openhab, use the following. You will need to check the current group owner of the /dev/[yourdevice] file and the /run/lock folder and change dialout and lock below appropriately.

usermod -a -G dialout openhab
usermod -a -G lock openhab

  • Reboot. I’ve found that I have to reboot in order to connect to the modem properly. This should not be the case but it is the simplest solution for now. It’s possible that there is a permissions issue, dependency loading issue or something along those lines that gets resolved by a reboot.

  • Clone the Insteon Terminal repo from GitHub

git clone https://github.com/pfrommerd/insteon-terminal.git

  • Copy the example config file and edit appropriately

$ cd insteon-terminal
$ cp init.py.example init.py
$ vi init.py # Or use your editor of choice here
$ # Follow the instructions in the config file to configure Insteon Terminal using the PLM device we found earlier

  • Launch Insteon Terminal. From the insteon-terminal folder, launch the terminal

./insteon-terminal

At this point you should see something like the following...

philip@cube:~/github/insteon-terminal$ ./insteon-terminal
Insteon Terminal
Connecting
Connected
Terminal ready!

If you do not see Connected and instead see something like this...

philip@cube:~/github/insteon-terminal$ ./insteon-terminal
Insteon Terminal
Connecting
gnu.io.NoSuchPortException
Terminal ready!

Continue to troubleshoot why you cannot connect to your PLM modem device

Elasticsearch Notes

Elasticsearch is a wonderful and powerful search and analytics engine. Operating an ES cluster or node is generally fairly easy and straightforward; however, there are a few situations where the resolution to seemingly common issues is not so clear. I will gather my notes and helper scripts here in an effort to help others better understand and resolve these issues and configurations quickly.

Stats

Determine the amount of space on each node and other storage related stats

curl -XGET 'http://localhost:9200/_nodes/stats' | jq '.nodes [] | .name, .fs'

Settings

Set number of replicas for all indices to 0

When you spin up a single node cluster, the default setting for number of replicas is 1. This means that the cluster is going to try to create a second copy of each shard. This is not possible as you only have one node in the cluster. This keeps your cluster (single node) in the yellow status and it will never reach green. A node can function this way but it is annoying to not see a green state when everything is actually healthy.

curl -XPUT 'localhost:9200/_settings' -H 'Content-Type: application/json' -d '{"index": { "number_of_replicas": 0 } }'

Recovering

Ran out of disk

When you run out of disk, shards will not have been allocated and your cluster will likely be stuck in status RED. To recover, you need to find out which indices have unassigned shards and assign them manually.

Commands

Check your cluster’s health and status of unassigned shards

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'

Display the indices health

curl -XGET 'http://localhost:9200/_cluster/health?level=indices&pretty'

Display shards

curl -XGET 'http://localhost:9200/_cat/shards'

Display all unassigned shards and reason for being unassigned

curl -s -XGET 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED
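
On a cluster with many shards, a per-reason count is often easier to read than the raw list. The sketch below assumes the five columns requested above (index, shard, prirep, state, unassigned.reason) in that order; unassigned_by_reason is a hypothetical helper and the sample rows are made up for illustration.

```bash
# Hypothetical helper: count UNASSIGNED shards per reason from
# `_cat/shards?h=index,shard,prirep,state,unassigned.reason` output on stdin.
unassigned_by_reason() {
  awk '$4 == "UNASSIGNED" { count[$5]++ }
       END { for (r in count) print r, count[r] }' | sort
}

# Made-up sample rows in the column order used above.
printf '%s\n' \
  'logs-1 0 p STARTED' \
  'logs-1 0 r UNASSIGNED NODE_LEFT' \
  'logs-2 1 p UNASSIGNED ALLOCATION_FAILED' \
  'logs-2 1 r UNASSIGNED NODE_LEFT' | unassigned_by_reason
```

Against a live node you would pipe the curl command above into unassigned_by_reason instead of grep.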

Helper Scripts

SSL Automation With LetsEncrypt in Kubernetes

Problem

When deploying services to Kubernetes, a certificate has to be injected into the container via a secret. It doesn’t make sense to have each container renew its own certificates, as its state can be wiped at any given time.

Solution

Build a service within each Kubernetes namespace that handles renewing all certificates used in that namespace. This service would kick off the request to renew each cert at a predetermined interval. It would then accept all verification requests ( GET request to domain/.well-known/acme-challenge ) and respond as necessary. After being issued the new certificate, it would recreate the appropriate secret which contains that certificate and initiate a restart of any container or service necessary.
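
The "recreate the appropriate secret" step could look roughly like the sketch below. Everything named here is an assumption for illustration: secret_name_for and push_cert_secret are hypothetical helpers, the <domain>-tls naming scheme is made up, and the certificate paths assume Certbot’s usual /etc/letsencrypt/live/<domain>/ layout. It also assumes a kubectl recent enough to support --dry-run=client.

```bash
# Hypothetical naming scheme: example.com -> example-com-tls
secret_name_for() {
  printf '%s-tls\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Hypothetical helper: create or update the TLS secret for one domain.
# Assumes Certbot's default live directory layout.
push_cert_secret() {
  local domain=$1 namespace=$2
  local live_dir=/etc/letsencrypt/live/$domain
  # Render the secret locally, then apply it so an existing
  # secret is updated instead of causing an "already exists" error.
  kubectl create secret tls "$(secret_name_for "$domain")" \
    --namespace "$namespace" \
    --cert "$live_dir/fullchain.pem" \
    --key "$live_dir/privkey.pem" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

After a successful renewal the service would call something like push_cert_secret example.com my-namespace and then trigger whatever restart the consuming pods need.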

Spec

SSL Renewal Container

To automate the creation and renewal of certificates, we will need to create a container with LetsEncrypt to request creation or renewal of each certificate, Nginx to receive and confirm domain validation, and scripts to push the generated certificates to secrets in Kubernetes. This container will be deployed to Kubernetes as a daemonset and should run in each of your Kubernetes clusters.

Container Creation & Setup

  • Nginx
  • LetsEncrypt (CertBot)

Pushing Secrets

  • kubectl
  • Access?

Restarting Services

  • kubectl

Domain List Configuration

SSL Ingress in Kubernetes

Previously, to use SSL/TLS in Kubernetes, we had to set up some sort of SSL/TLS termination proxy. With the addition of a few new features in Kubernetes 1.2 Ingress, we’re able to do away with the proxy and allow Kubernetes to handle this task.

Chef Provisioner SSL Errors

While setting up chef-provisioning to provision servers in Google Cloud, I ran into a pretty tricky bug which took a number of hours to troubleshoot.

Command I Was Running

chef-client -z elasticsearch-cluster.rb

The error…

Compiled Resource:
------------------
# Declared in /Users/philip/github/chef-storj/provisioners/elasticsearch-cluster.rb:21:in `from_file'

machine("elasticsearch-1") do
  action [:converge]
  retries 0
  retry_delay 2
  default_guard_interpreter :default
  chef_server {:chef_server_url=>"http://localhost:8889", :options=>{:api_version=>"0"}}
  driver "fog:Google"
  machine_options {:insert_options=>{:tags=>{:items=>["elasticsearch"]}, :disks=>[{:deviceName=>"elasticsearch-1", :autoDelete=>true, :boot=>true, :initializeParams=>{:sourceImage=>"projects/ubuntu-os-cloud/global/images/ubuntu-1404-trusty-v20150316", :diskType=>"zones/us-east1-b/diskTypes/pd-ssd", :diskSizeGb=>80}}, {:type=>"PERSISTENT", :mode=>"READ_WRITE", :zone=>"zones/us-east1-b", :source=>"zones/us-east1-b/disks/elasticsearch-1", :deviceName=>"elasticsearch-1"}]}, :key_name=>"google_default"}
  declared_type :machine
  cookbook_name "@recipe_files"
  recipe_name "/Users/philip/github/chef-storj/provisioners/elasticsearch-cluster.rb"
  run_list ["recipe[chefsj-elk::elasticsearch-1]"]
end

[2016-05-03T13:26:18-04:00] INFO: Running queued delayed notifications before re-raising exception

Running handlers:
[2016-05-03T13:26:18-04:00] ERROR: Running exception handlers
Running handlers complete
[2016-05-03T13:26:18-04:00] ERROR: Exception handlers complete
Chef Client failed. 0 resources updated in 04 seconds
[2016-05-03T13:26:18-04:00] FATAL: Stacktrace dumped to /Users/philip/.chef/local-mode-cache/cache/chef-stacktrace.out
[2016-05-03T13:26:18-04:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2016-05-03T13:26:18-04:00] ERROR: machine[elasticsearch-1] (@recipe_files::/Users/philip/github/chef-storj/provisioners/elasticsearch-cluster.rb line 21) had an error: Faraday::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
[2016-05-03T13:26:19-04:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

Testing SSL

Using the knife ssl check command, check the status of SSL between you and your Chef server.

Obtaining an Updated cert.pem

curl http://curl.haxx.se/ca/cacert.pem -o /usr/local/etc/openssl/cert.pem

The Problem

The precompiled versions of ruby from RVM are pointing at /etc/openssl/certs when looking for their CA certificate file. Newer versions of OSX have moved their certs to a different directory, or possibly /usr/local/etc/openssl/certs if you’ve installed openssl from brew or some other source.

The Solution

Reinstall ruby from source.

rvm reinstall 2.2.1 --disable-binary

Uninstall all the chef gems

gem uninstall chef chef-zero berkshelf knife-solo

Reinstall ChefDK

Links

Bash Tricks and Shortcuts

Loops

Oftentimes you need to run the same task in bash against a number of different arguments. Loops in bash can make this very quick and easy.

One of the simplest ways you can do this in a one liner is as follows

$ for i in one two three four; do echo $i; done
one
two
three
four

You can also predefine an array to use later like this

files=( "/tmp/file_one" "/tmp/file_two" "/tmp/file_three" )
for i in "${files[@]}"
do
  echo $i
done

Or, to do this on one line

$ files=("/tmp/file_one" "/tmp/file_two" "/tmp/file_three" ); for i in "${files[@]}"; do echo $i; done
/tmp/file_one
/tmp/file_two
/tmp/file_three

You can use ranges with seq

for year in $(seq 2000 2013); do echo $year; done
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013

If you need a counter you could do something like this

#!/bin/bash
## declare an array variable
declare -a array=("one" "two" "three")

# get length of an array
arraylength=${#array[@]}

# use a for loop to read all values and indexes
for (( i=1; i<${arraylength}+1; i++ ));
do
  echo $i " / " ${arraylength} " : " ${array[$i-1]}
done

File Permissions

There are a few shortcuts that make life easier when working with file and directory permissions. Here are a few.

When you want to recursively change permissions in a directory, you will want to change the file permissions separately from the directory permissions. You can accomplish this by using two different find commands piped to xargs as follows.

$ find . -type d -print0 | xargs -0 chmod 0755 # for directories
$ find . -type f -print0 | xargs -0 chmod 0644 # for files

or

$ find /path/to/directory -type d -exec chmod g+rsx '{}' \;
$ find /path/to/files -type f -exec chmod g+rsx '{}' \;
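
To try the find and xargs form without touching real data, the sketch below builds a throwaway tree with mktemp and applies the same two commands to it. The directory and file names are made up for the demo.

```bash
# Build a scratch tree with deliberately loose permissions.
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
touch "$demo/top.txt" "$demo/sub/deeper/nested.txt"
chmod -R 0777 "$demo"

# Directories get 0755, regular files get 0644.
find "$demo" -type d -print0 | xargs -0 chmod 0755
find "$demo" -type f -print0 | xargs -0 chmod 0644

# Show the result.
ls -lR "$demo"
```

The -print0 and -0 pair keeps file names containing spaces or newlines intact on their way from find to chmod.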

Three permission triads

first triad       what the owner can do
second triad      what the group members can do
third triad       what other users can do

Each triad

first character   r: readable
second character  w: writable
third character   x: executable
                  s or t: executable and setuid/setgid/sticky
                  S or T: setuid/setgid or sticky, but not executable

References, Operators and Modifiers

Above, you can see that permissions can be changed using u, g, o and a. These represent references to User, Group, Other and All.

  • (u)ser:
      • The user is the owner of the file. The owner of a file or directory can be changed with the chown [3] command.
      • Read, write and execute privileges are individually set for the user with 0400, 0200 and 0100 respectively. Combinations can be applied as necessary, e.g. 0700 is read, write and execute for the user.
  • (g)roup:
      • A group is the set of people that are able to interact with that file. The group set on a file or directory can be changed with the chgrp [4] command.
      • Read, write and execute privileges are individually set for the group with 0040, 0020 and 0010 respectively. Combinations can be applied as necessary, e.g. 0070 is read, write and execute for the group.
  • (o)ther:
      • Represents everyone who isn’t the owner or a member of the group associated with that resource. Other is often referred to as “world”, “everyone” etc.
      • Read, write and execute privileges are individually set for other with 0004, 0002 and 0001 respectively. Combinations can be applied as necessary, e.g. 0007 is read, write and execute for other.
  • (a)ll:
      • Represents everyone

The operator is what is used to control adding or removing of modifiers

  • + adds the specified file mode bits to the existing file mode bits of each file
  • - removes the specified file mode bits from the existing file mode bits of each file
  • = adds the specified bits and removes unspecified bits, except the setuid and setgid bits set for directories, unless explicitly specified.

Modifiers

  • r read
  • w write
  • x execute (or search for directories)
  • X execute/search only if the file is a directory or already has execute bit set for some user
  • s setuid or setgid (depending on the specified references)
  • S setuid or setgid (depending on the specified references) without the executable bit (or search for directories) set
  • t restricted deletion flag or sticky bit

Octal

  • The read bit adds 4 to its total (in binary 100),
  • The write bit adds 2 to its total (in binary 010), and
  • The execute bit adds 1 to its total (in binary 001).

These values never produce ambiguous combinations; each sum represents a specific set of permissions. More technically, this is an octal representation of a bit field – each bit references a separate permission, and grouping 3 bits at a time in octal corresponds to grouping these permissions by user, group, and others.
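
As a worked example of that arithmetic, the sketch below adds up 4, 2 and 1 for each triad of a nine character permission string. sym_to_octal is a hypothetical helper written for illustration; it only looks at plain r, w and x and ignores the setuid, setgid and sticky markers.

```bash
# Hypothetical helper: convert a 9 character string such as rwxr-xr-x
# into its three digit octal form. Special bits (s, S, t, T) are ignored.
sym_to_octal() {
  local perms=$1 out="" i triad digit
  for i in 0 3 6; do
    triad=${perms:i:3}
    digit=0
    if [[ ${triad:0:1} == r ]]; then digit=$((digit + 4)); fi
    if [[ ${triad:1:1} == w ]]; then digit=$((digit + 2)); fi
    if [[ ${triad:2:1} == x ]]; then digit=$((digit + 1)); fi
    out+=$digit
  done
  echo "$out"
}

sym_to_octal rwxr-xr-x   # 4+2+1, 4+0+1, 4+0+1
sym_to_octal rw-r--r--   # 4+2+0, 4+0+0, 4+0+0
```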

SetUID, SetGID and the Sticky Bit

SUID / Set User ID : A program is executed with the file owner’s permissions (rather than with the permissions of the user who executes it).

$ chmod u+s testfile.txt
$ chmod 4750 testfile.txt

SGID / Set Group ID : Files created in the directory inherit its GID. That is, when a directory is shared between users and SGID is set on that shared directory, any file or directory those users create inside it gets the same GID (group owner) as its parent directory.

$ chmod g+s /path/to/directory
$ chmod 2750 /path/to/directory

Sticky Bit : It is mainly used on folders to prevent other users from deleting a folder’s contents even though they have write permission. If the sticky bit is enabled on a folder, the files in it can be deleted or renamed only by the owner of the file, the owner of the folder, or the superuser (root). This is a security measure to protect critical folders that are fully writable by others.

$ chmod o+t /opt/ftp-data
$ chmod +t /opt/ftp-data
$ chmod 1757 /opt/ftp-data

’S’ = The directory’s setgid bit is set, but the execute bit isn’t set.
’s’ = The directory’s setgid bit is set, and the execute bit is set.

These are represented in the ls -la (list all files in list format) by the following

Permissions Meaning
--S------   SUID is set, but user (owner) execute is not set.
--s------   SUID and user execute are both set.
-----S---   SGID is set, but group execute is not set.
-----s---   SGID and group execute are both set.
--------T   Sticky bit is set, but other execute is not set.
--------t   Sticky bit and other execute are both set.
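
The table above can be turned into a quick check. This sketch takes the full ten character mode string as printed by ls -l (file type character first, so one character longer than the table rows) and reports which special bits are set. special_bits is a hypothetical helper name.

```bash
# Hypothetical helper: report which special bits appear in a
# 10 character mode string from `ls -l`, e.g. drwxrwsr-x.
special_bits() {
  local mode=$1
  local found=()
  # Index 3, 6 and 9 are the execute slots of user, group and other.
  case ${mode:3:1} in s|S) found+=("suid") ;; esac
  case ${mode:6:1} in s|S) found+=("sgid") ;; esac
  case ${mode:9:1} in t|T) found+=("sticky") ;; esac
  echo "${found[*]:-none}"
}

special_bits -rwsr-xr-x
special_bits drwxrwsr-x
special_bits drwxrwxrwt
special_bits -rw-r--r--
```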

Permissions for Multi User Samba Directory

chmod -R u=rwX,g=rwXs,o=rX <share_root>