Some nix-trix
So today I tried to run some stuff on a Ubuntu server from my Windows 10 machine. I did it in four steps that I liked, and that I wanted to write down for future reference.
I take it from memory, so I’m almost certain there is minor issues in the code below. But the main idea should be correct still.
1) Connecting via WSL
So the first thing I did was install WSL on Windows 10, and then Ubuntu on that. So Now I had Ubuntu myself, with ssh and all other stuff included.
Since I also use WSL sometimes for running applications on my own machine (easy installation of Perl applications (latexdiff
), Ruby kramdown
) and so on, this was a minor thing.
I connect via ssh
# create SSH session 1
$ ssh username@server
username@server's password:
2) Reproducing environment via git and conda
All code and config was put in a git
repo.
On the server I created a repo where I wanted to create the environment. I made sure there was a master branch, because I didn’t realize how to push to a bare repo. I guess that step can be made smoother.
It was also important to detach the head so that the master branch was no the same as the working tree. If this step is omitted, others cannot push onto the master branch.
# in SSH session 1
$ mkdir ~/foo
$ cd ~/foo
$ git init .
$ touch readme.md
$ git add .
$ git commit -m "created master branch"
$ git checkout --detach
Then on my windows machine, I created a remote, checked out, rebased, and pushed.
PS> cd path/to/stuff
PS> git remote add serverName ssh://username@server/~/foo
PS> git fetch --all serverName
PS> git checkout master
PS> git rebase serverName/master
PS> git push --set-upstream servername master
then on the server again, checked out the master branch (but as detached head!), installed all dependencies and ran the code.
# in SSH session 1
$ git checkout master~0
$ conda env create --file=environment.yml
$ python whatever.py
So that is quite nice and simple. There was some annoyances with fixing the remote tracking properly, but this was all okay. On subsequent runs I changed conda env create --file=environment.yml
to conda env update --file=environment.yml --prune
.
3) Starting the job via tmux for long runs
The last command python whatever.py
was a long running job, and long running jobs over ssh is not a good idea. If ever the connection is closed, a hang up
is issued and running programs are stopped. There are ways to deal with that, such as the nohup
POSIX command. But a friend of mine, Tomas Aschan, had a better idea: to use tmux.
tmux can do several things, but what is relevant for this context it can create terminals that live a separate “lifecycle”. The terminals can be detached and reattached. Programs run in that terminal don’t know they are created via ssh, so it is safe to close the connection and tmux lives on in the background running the program for you. The relevant commands can be found on most tmux cheat sheets, such as this one
On the server, I run
# in SSH session 1
$ tmux new -s my_session
$ python whatever.py
Connect with a new ssh connection, and run
# in SSH session 2
$ tmux ls
my_session: 1 windows (created Wed Feb 3 11:43:17 2021) [120x29] (attached)
you see that the first ssh connection is still open, so the tmux session is attached. If you then close the ssh connection you find
# in SSH session 2
$ tmux ls
my_session: 1 windows (created Wed Feb 3 11:43:17 2021) [120x29]
$ tmux a # attach to the last session
4) Getting stuff back
Lets say some output files have been generated in ~/foo/output
, and you want to get them back to the Windows 10 machine.
This is easily done with scp
. Start the WSL Ubuntu shell and run
$ cd /mnt/c/Users/Ludvig/Desktop/
$ mkdir output
$ cd output
$ scp -r username@server/~/foo/output .
Voila! The -r
flag makes sure the copy is recursive.