Floydhub Documentation

Sun 20 August 2017 by Aditya Arora, Akshita Gupta

Floyd is a great cloud computing platform. I have used it to run many of my deep learning models. Its main advantage, and my main reason for giving it a try, is that you don't need a credit card to get started.

Floyd has pretty handy documentation at Floydhub, which is updated at regular intervals.

I initially wrote this post when there was a lot of confusion about the setup, but the official docs page is now crisp and easy to use. This post contains the steps to get up and running with Floyd on Windows.

Python Machine Learning Setting up Floyd

Setting up Floyd for use on Windows.

Account

Create an account at Floydhub.

You can get a free account, which gives you 10GB of storage and 20 hours of CPU time. You can upgrade to any of their paid plans if you want to use their GPUs and continue using their services.

Installation

Install floyd-cli using pip:

$ pip install -U floyd-cli
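
To confirm that the install worked before moving on, you can check whether the floyd command is visible on your PATH. The snippet below is only a minimal sketch using the Python standard library; the helper name floyd_cli_available is my own and is not part of floyd-cli.

```python
import shutil


def floyd_cli_available(executable="floyd"):
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if floyd_cli_available():
        print("floyd command found on PATH")
    else:
        print("floyd command not found; try: pip install -U floyd-cli")
```

If the command is not found right after installing, opening a new terminal window so that PATH is reloaded is usually worth a try.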

Login

Make sure you are logged in to Floydhub in your browser before continuing with this step.

$ floyd login

Copy your authentication token and paste it into your terminal.

Run project

  1. Make a project folder

    $ mkdir myproject
    
  2. Change to project folder

    $ cd myproject
    
  3. Make a folder for data

    $ mkdir <mydata>
    
  4. Make a folder for code

    $ mkdir <mycode>
    

    Copy all your data into <mydata> and your code into <mycode>.

  5. Change to data directory

    $ cd <mydata>
    
  6. Initialise data

    $ floyd data init sent-data
    
  7. Upload your dataset to Floyd

    $ floyd data upload
    

    Floyd will generate a data ID for the uploaded dataset. You can use this data ID to reuse the dataset in future experiments.
    Output:

    Creating data source. Uploading files ...
    DATA ID                 NAME                    VERSION
    ----------------------  --------------------  ---------
    GY3QRFFUA8KpbnqvroTPPW  alice/sent-data:1            1
    
  8. Change to the code directory

    $ cd ../<mycode>
    
  9. Run the code

    $ floyd run --data <DATA_KEY>:<MOUNT> --env theano-0.9:py2 "python filename.py"
    

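    This runs the code in the cloud. Here <DATA_KEY> is the data ID from step 7 and <MOUNT> is the mount point under which the data is made available to your script.
    Example: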

    $ floyd run --data GY3QRFFUA8KpbnqvroTPPW:mydata --env theano-0.9:py2 "python mlp.py"
    
  10. Check the status of your run through its logs

    $ floyd logs <RUN_ID>
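
If you submit runs often, the run command from step 9 can be assembled in a small Python script instead of being typed by hand. This is just a sketch under assumptions: the function name build_run_command is hypothetical, and the flags and the default environment are copied from the command shown above rather than from any official API.

```python
import subprocess


def build_run_command(data_id, mount, script, env="theano-0.9:py2"):
    """Build the argument list for the `floyd run` command from step 9."""
    return [
        "floyd", "run",
        "--data", f"{data_id}:{mount}",  # <DATA_KEY>:<MOUNT>
        "--env", env,
        f"python {script}",              # command executed in the cloud
    ]


if __name__ == "__main__":
    cmd = build_run_command("GY3QRFFUA8KpbnqvroTPPW", "mydata", "mlp.py")
    # Show the command in the same form as in the steps above.
    print(" ".join(cmd[:-1]), f'"{cmd[-1]}"')
    # Uncomment to submit the job for real. This needs floyd-cli installed
    # and a prior `floyd login`, and must be run from the code folder.
    # subprocess.run(cmd, check=True)
```

Passing the command as a list means the quoted "python mlp.py" part stays a single argument, so you don't have to worry about shell quoting on Windows.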
    

Some common problems:

  • Should I put my code and data in separate folders?

Note: Keeping the code in a separate folder prevents the data from being uploaded to the Floyd server again on every run. When you run the code with the above command, Floyd synchronizes the WHOLE FOLDER in which the code is located. If the data were in that folder too, it would waste upload time and store the data redundantly, since you have already uploaded it to the server.
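
A quick way to see whether stray data files have ended up in your code folder is to add up the size of everything in it before you run. The helper below is a rough sketch and its name is my own; it only sums the file sizes on disk, so treat the result as an approximation of what would be synchronized.

```python
import os


def folder_size_bytes(folder):
    """Sum the sizes of all files under `folder`, including subfolders."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


if __name__ == "__main__":
    # "." is the folder you would start the run from, i.e. the code folder.
    size_mb = folder_size_bytes(".") / (1024 * 1024)
    print(f"Code folder holds about {size_mb:.1f} MB")
```

If the number is much larger than you expect for a few scripts, a dataset has probably been left next to the code.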

  • How do I link data in my script?
    import numpy as np
    x_data = np.load('/nfiles/fileone.npy')
    y_data = np.load('/nfiles/filetwo.npy')
    

Here, nfiles is my MOUNT point, i.e. the name I passed after the colon in the --data option.
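
To avoid hard-coding the mount point in several places, the path can be built from a single constant. This is a minimal sketch that assumes, as in the snippet above, that the data appears at /<MOUNT>; the names MOUNT, mounted_path and load_arrays are my own.

```python
import posixpath

# Mount point passed on the command line as <DATA_KEY>:nfiles
MOUNT = "nfiles"


def mounted_path(filename, mount=MOUNT):
    """Return the absolute path of a dataset file under the mount point."""
    return posixpath.join("/", mount, filename)


def load_arrays():
    """Load the two arrays from the mounted dataset."""
    # Imported here so the path helper above also works without NumPy.
    import numpy as np

    x_data = np.load(mounted_path("fileone.npy"))
    y_data = np.load(mounted_path("filetwo.npy"))
    return x_data, y_data
```

posixpath is used instead of os.path so that the path comes out with forward slashes even if you try the helper on your Windows machine first.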

  • If you are using the Anaconda prompt and "pip install -U floyd-cli" fails with the error "scandir could not be installed", install scandir with conda and then run the pip command again:
    $ conda install -c conda-forge scandir=1.5
    

If you run into any problems, kindly raise an issue in the GitHub repo.

