What is TensorFlow?
TensorFlow is a numerical computation library developed by Google and open source since 2015. Its distinguishing feature is its use of data flow graphs.
TensorFlow was developed by Google researchers and engineers to carry out research in machine learning and deep neural networks. The system is nonetheless general enough to apply to a wide range of other domains.
TensorFlow can be used to efficiently train models that require large volumes of data (e.g. image banks).
TensorFlow’s flexible architecture makes it possible to deploy computations on one or more CPUs or GPUs, on a personal computer, a server, and so on, without having to rewrite code.
Google has also developed Tensor Processing Units (TPUs), built specifically for machine learning and for use with TensorFlow. The first generation of TPUs was intended to run and test models rather than train them. Since February 2018, TPUs have been available in beta on Google Cloud Platform.
TensorFlow builds on the DistBelief infrastructure (Google, 2011). Its Python interface is unusually low-level (closer to the machine architecture) compared with typical Python code, so do not panic if you need some time to adapt to TensorFlow!
Many companies and applications use TensorFlow today, among them Airbnb, Nvidia, Uber, Dropbox, eBay, Google (of course), Snapchat, Twitter, and many more!
With Cloud Datalab, TensorFlow can be used on GCP, either with the default configuration or on a customized virtual machine (number of cores, CPU or GPU, etc.).
What does TensorFlow look like?
As mentioned above, TensorFlow represents computations as an execution graph.
The nodes of the graph represent mathematical operations, such as addition, multiplication, matrix multiplication, differentiation, and so on.
The edges of the graph represent the tensors that flow between the nodes. A tensor can be, for example, an integer, a vector, or an image. Each node of the graph thus takes tensors as input, carries out its computation, then returns new tensors.
Example:
1 + 4 = 5
deriv(2x) = 2
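To make the notion of a tensor more concrete, here is a small sketch in plain Python (it does not use TensorFlow). The helper shape_of is a hypothetical name introduced only for this illustration; it treats nested lists as tensors and reports their dimensions, in the same spirit as the shape attribute of a TensorFlow tensor.

```python
def shape_of(value):
    """Return the dimensions of a nested-list 'tensor' as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None  # descend one level
    return tuple(shape)

scalar = 7                                   # an integer: no dimension
vector = [1.0, 2.0, 3.0]                     # a vector: one dimension
image = [[[0, 0, 0] for _ in range(4)]       # an image: height 2, width 4,
         for _ in range(2)]                  # 3 colour channels per pixel

for name, value in [("scalar", scalar), ("vector", vector), ("image", image)]:
    print("{}: {}".format(name, shape_of(value)))
```

The integer has an empty shape, the vector has one dimension, and the image has three (height, width, channels).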
A TensorFlow program is divided into two main phases: construction and execution. During the construction phase, the variables and operations of the graph are defined and assembled. TensorFlow then manages the graph automatically, which allows it to optimize and parallelize execution.
The execution phase uses a session to execute the graph operations. A graph only executes operations once a session has been created. A session places the operations of the graph on devices (CPU, GPU, or TPU) and provides methods to execute them. Graph operations are launched with the run() method of the session, as we will see a little later. This graph execution system is one of the fundamental properties of TensorFlow and lets you execute all the operations of the graph in a single call.
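The two phases can be imitated with a toy example in plain Python. This is only a sketch of the idea of "build first, run later": Node, ToySession, constant, add and multiply are hypothetical names written for this illustration, not TensorFlow functions, and the sketch performs no optimization or device placement.

```python
class Node(object):
    """One operation of the toy graph; its inputs are other nodes."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

def constant(value):
    return Node(lambda: value)

def add(a, b):
    return Node(lambda x, y: x + y, a, b)

def multiply(a, b):
    return Node(lambda x, y: x * y, a, b)

class ToySession(object):
    """Evaluates a node by first evaluating the nodes it depends on."""
    def run(self, node):
        args = [self.run(parent) for parent in node.inputs]
        return node.op(*args)

# Construction phase: nothing is computed here, the graph is only described.
x = constant(1)
y = constant(4)
total = add(x, y)
scaled = multiply(total, constant(2))

# Execution phase: values are computed only when run() is called.
session = ToySession()
print(session.run(total))   # 5
print(session.run(scaled))  # 10
```

Until run() is called, total and scaled are only descriptions of a computation, which is why displaying a TensorFlow tensor before running it shows a tensor object rather than a number.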
TensorFlow has many options for Deep Learning, making it easy to build a neural network, use it, and train it optimally.
Google Cloud Platform
Google Cloud Platform (or GCP) is an online platform developed by Google. This platform provides services for creating virtual machines and networks.
What can we do?
- calculate
- store
- do machine learning / deep learning
And much more!
In this mini-tutorial, we will teach you how to:
- open a free account on GCP
- use Datalab on GCP from a personal computer running GNU/Linux
- write your first TensorFlow program
Create an account on GCP
How to access GCP?
You have access to a free credit of about $300 / €250 on the platform, valid for one year. You will be asked for your credit card details, but your account will NOT be charged: this is a precautionary measure against misuse of GCP.
Then,
- click Discover console. A tutorial starts and guides you through the platform.
- create / select a project
- follow the instructions
- click Go to console
Create (or select) a project on GCP at:
https://console.cloud.google.com/cloud-resource-manager
Click the following link to enable Google Compute Engine and Cloud Source Repositories APIs for the selected project:
https://console.cloud.google.com/flows/enableapi?apiid=compute,sourcerepo.googleapis.com
Installing gcloud on your computer
We will now install the Google Cloud SDK on your computer. At the time of writing, the Google Cloud SDK does not yet work with Python 3. Verify that Python 2.7.9 or later is installed on your computer with the command:
python2 -V
If necessary, upgrade your Python:
sudo apt-get install python2.7
Now, download one of the following packages depending on the architecture of your machine:
You can extract the file wherever you want in your file system.
tar zxvf filename.tar.gz
Then run the following two commands:
source './google-cloud-sdk/path.bash.inc'
source './google-cloud-sdk/completion.bash.inc'
Finally run gcloud init to initialize the SDK.
./google-cloud-sdk/bin/gcloud init
Follow the configuration instructions: choose / create a project, choose a zone etc.
You can now access help:
gcloud --help
Access to the datalab
Now update gcloud:
gcloud components update
Now install the gcloud datalab component:
gcloud components install datalab
Now create a Cloud Datalab instance. The instance name must start with a lowercase letter, followed by at most 63 lowercase letters, digits, or hyphens.
Warning! The name of the instance cannot end with a hyphen.
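The naming rule above can be checked before creating the instance. The sketch below encodes the rule exactly as stated in this tutorial in a regular expression; is_valid_instance_name is a hypothetical helper written for this illustration, not part of the datalab tool, and the limits actually enforced by GCP may differ slightly.

```python
import re

# One lowercase letter, then at most 63 lowercase letters, digits or hyphens,
# the last of which must not be a hyphen (rule as stated in the tutorial).
_INSTANCE_NAME = re.compile(r"[a-z](?:[a-z0-9-]{0,62}[a-z0-9])?\Z")

def is_valid_instance_name(name):
    """Return True if `name` follows the naming rule described above."""
    return _INSTANCE_NAME.match(name) is not None

for candidate in ["my-datalab-1", "my_datalab", "my-datalab-", "1datalab"]:
    print("{}: {}".format(candidate, is_valid_instance_name(candidate)))
```

Only the first candidate is accepted: the second contains an underscore, the third ends with a hyphen, and the fourth starts with a digit.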
datalab create name-of-your-instance
Follow the instructions. Choose a geographic zone close to where you are. You will be asked to enter a passphrase to generate an SSH key.
Warning! If the process takes too long or fails, feel free to restart the instance creation. Connection problems are common during this step.
If you go back to your GCP dashboard and click the menu icon (three bars) at the top left, then Compute Engine, your instance appears as active.
Finally, when the connection to the datalab is established, open the Cloud Datalab homepage in your browser with the following link: http://localhost:8081
You have access to a Jupyter notebook. Create a new Jupyter notebook by clicking on the ‘+’ symbol to the left of Notebook.
Note: to reconnect to an instance after exiting, run:
datalab connect name-of-your-instance
We now offer a short introduction to TensorFlow on Google Cloud Platform.
My first program in TensorFlow on GCP
1) Hello world!
Type the following command lines in your Jupyter notebook. To validate a cell, type Shift + Enter.
import tensorflow as tf  # Import the TensorFlow library
hello = tf.constant('Hello!')  # Define a constant named hello holding the string 'Hello!'
If you type hello in your notebook, you get
<tf.Tensor 'Const:0' shape=() dtype=string>
This is expected: a fundamental TensorFlow building block discussed earlier in this article is still missing, namely the session, created with tf.Session().
session = tf.Session()  # Create a session
An error message may appear; ignore it.
session.run(hello)  # Run the hello operation in the session
'Hello!'
This small example introduces two essential elements of any TensorFlow program: tf.Session() and the run() method of the session.
2) Basic mathematical operations
x = tf.constant(3)  # Define the constant x = 3
y = tf.constant(2)  # Define the constant y = 2
X = tf.constant([1, 0], shape=(2, 1))  # Define the constant column vector X
M = tf.constant([1, 1, 2, 2], shape=(2, 2))  # Define the constant matrix M
result_1 = tf.add(x, y)  # Addition
result_2 = tf.multiply(x, y)  # Multiplication
result_3 = tf.matmul(M, X)  # Matrix multiplication
session = tf.Session()  # Create a session
session.run(result_1)  # Evaluate the addition
session.run(result_2)  # Evaluate the multiplication
session.run(result_3)  # Evaluate the matrix multiplication
As before, do not forget to run the session!
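To know what values to expect from these three operations, the same arithmetic can be redone in plain Python. This sketch assumes that tf.constant fills the requested shape row by row, so that M is [[1, 1], [2, 2]] and X is the column [[1], [0]]; matmul here is a small helper written for this illustration, not the TensorFlow function.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

x, y = 3, 2
M = [[1, 1], [2, 2]]   # values 1, 1, 2, 2 laid out as a 2 x 2 matrix
X = [[1], [0]]         # values 1, 0 laid out as a 2 x 1 column

print(x + y)           # 5
print(x * y)           # 6
print(matmul(M, X))    # [[1], [2]]
```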
3) Variables and initialization
In TensorFlow, variables are defined and manipulated with tf.Variable().
x = tf.Variable(0)
A variable represents a tensor whose value can change when a computation is run.
x = tf.constant(0)
y = tf.Variable(x + 1)
Note: Variables must be initialized before the graph can use them. The function tf.global_variables_initializer() is used for this purpose.
A small example of a program:
import tensorflow as tf
x = tf.constant(0)
y = tf.Variable(x + 1)
initialization = tf.global_variables_initializer()  # Operation that initializes the variables
with tf.Session() as session:
    session.run(initialization)  # Run the initialization first
    print(session.run(y))  # Then evaluate y
Example of use: A Variable can be used to hold the weights w of a neural network, and is thus updated during the training of the model.
4) Placeholders
A Placeholder is a Tensor created with the tf.placeholder() function. When it is created, no value is assigned to it; only the data type and the dimensions are specified:
x = tf.placeholder(tf.float32, shape=(1024, 1024))  # Create placeholder x of type float32 and shape (1024, 1024)
A Placeholder can be seen as a Variable that only receives its data later in the program, during the execution phase. Its value is supplied when a computation is run.
import numpy as np  # Needed to build the random array
x = tf.placeholder(tf.float32, shape=(1024, 1024))  # Create the placeholder x
y = tf.matmul(x, x)  # Matrix product of x with itself
with tf.Session() as sess:
    rand_array = np.random.rand(1024, 1024)  # Random 1024 x 1024 array
    print(sess.run(y, feed_dict={x: rand_array}))  # Give x the value rand_array during execution
Example of use: at each iteration of the training of a neural network, a Placeholder is used to feed the model with a new batch of images.
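The idea of feeding a new batch at each iteration can be sketched in plain Python. The generator batches and the list images are hypothetical stand-ins introduced for this illustration; in a real program, each batch would be passed to the session through feed_dict, as in the example above.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

images = ["img_{}".format(i) for i in range(10)]  # stand-ins for real image data

for step, batch in enumerate(batches(images, 4)):
    # In TensorFlow, this is where the batch would be fed to the placeholder,
    # for example with feed_dict={x: batch} in the call to run().
    print("step {}: {} images".format(step, len(batch)))
```

With 10 items and a batch size of 4, the loop runs three times, and the last batch holds the 2 remaining items.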
Conclusion
You now have a credit of €250 on GCP and know how to access Datalab from your computer. You have also written your first lines of TensorFlow code and have the basics needed to understand more advanced code written with this library.
To go further, you’ll find more complex examples of TensorFlow applications in Datalab on GCP, as well as other machine learning tutorials.
Enjoy your exploration!
References and sources
https://www.tensorflow.org/
https://github.com/tensorflow/tensorflow
https://cloud.google.com/
https://cloud.google.com/datalab/docs/quickstart
https://cloud.google.com/solutions/running-distributed-tensorflow-on-compute-engine
https://www.nvidia.fr/daa-center/gpu-accelerated-applications/tensorflow/
https://blog.xebia.fr/2017/03/01/tensorflow-deep-learning-episode-1-introduction/
https://learningtensorflow.com/lesson2/