{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# TensorFlow with Estimators\n", "\n", "We saw previously how to build a full Multi-Layer Perceptron model with full Sessions in TensorFlow. Unfortunately, this was an extremely involved process. However, developers have created Estimators, which offer an easier-to-use workflow!\n", "\n", "The Estimator API is much easier to use, but you sacrifice some level of customization of your model. Let's go ahead and explore it!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get the Data\n", "\n", "We will use the iris data set, which has four numeric features (sepal and petal length and width, in cm) and a three-class target.\n", "\n", "Let's get the data:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "df = pd.read_csv('iris.csv')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
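<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n",
" <th></th>\n", " <th>sepal length (cm)</th>\n", " <th>sepal width (cm)</th>\n", " <th>petal length (cm)</th>\n", " <th>petal width (cm)</th>\n", " <th>target</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n",
" <tr>\n", " <th>0</th>\n", " <td>5.1</td>\n", " <td>3.5</td>\n", " <td>1.4</td>\n", " <td>0.2</td>\n", " <td>0.0</td>\n", " </tr>\n",
" <tr>\n", " <th>1</th>\n", " <td>4.9</td>\n", " <td>3.0</td>\n", " <td>1.4</td>\n", " <td>0.2</td>\n", " <td>0.0</td>\n", " </tr>\n",
" <tr>\n", " <th>2</th>\n", " <td>4.7</td>\n", " <td>3.2</td>\n", " <td>1.3</td>\n", " <td>0.2</td>\n", " <td>0.0</td>\n", " </tr>\n",
" <tr>\n", " <th>3</th>\n", " <td>4.6</td>\n", " <td>3.1</td>\n", " <td>1.5</td>\n", " <td>0.2</td>\n", " <td>0.0</td>\n", " </tr>\n",
" <tr>\n", " <th>4</th>\n", " <td>5.0</td>\n", " <td>3.6</td>\n", " <td>1.4</td>\n", " <td>0.2</td>\n", " <td>0.0</td>\n", " </tr>\n",
" </tbody>\n", "</table>\n", "</div>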
" ], "text/plain": [ " sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) \\\n", "0 5.1 3.5 1.4 0.2 \n", "1 4.9 3.0 1.4 0.2 \n", "2 4.7 3.2 1.3 0.2 \n", "3 4.6 3.1 1.5 0.2 \n", "4 5.0 3.6 1.4 0.2 \n", "\n", " target \n", "0 0.0 \n", "1 0.0 \n", "2 0.0 \n", "3 0.0 \n", "4 0.0 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "df.columns = ['sepal_length','sepal_width','petal_length','petal_width','target']" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X = df.drop('target',axis=1)\n", "y = df['target'].apply(int)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train Test Split" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Estimators\n", "\n", "Let's show you how to use the simpler Estimator interface!" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\Marcial\\Anaconda3\\lib\\site-packages\\h5py\\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. 
In future, it will be treated as `np.float64 == np.dtype(float).type`.\n", " from ._conv import register_converters as _register_converters\n" ] } ], "source": [ "import tensorflow as tf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Feature Columns\n", "\n", "Feature columns tell the Estimator how to interpret each input column. All four iris features are continuous, so each one becomes a numeric column." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='object')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.columns" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "feat_cols = []\n", "\n", "for col in X.columns:\n", "    feat_cols.append(tf.feature_column.numeric_column(col))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[_NumericColumn(key='sepal_length', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),\n", " _NumericColumn(key='sepal_width', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),\n", " _NumericColumn(key='petal_length', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),\n", " _NumericColumn(key='petal_width', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "feat_cols" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Input Function\n", "\n", "The input function feeds batches of features and labels to the Estimator during training." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# pandas_input_fn builds the input function straight from the DataFrame\n", "# (tf.estimator.inputs.numpy_input_fn is the equivalent for NumPy arrays)\n", "input_func = tf.estimator.inputs.pandas_input_fn(x=X_train, y=y_train, batch_size=10, num_epochs=5, shuffle=True)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Using default config.\n", "WARNING:tensorflow:Using temporary 
folder as model directory: C:\\Users\\Marcial\\AppData\\Local\\Temp\\tmp3_l8l99d\n", "INFO:tensorflow:Using config: {'_model_dir': 'C:\\\\Users\\\\Marcial\\\\AppData\\\\Local\\\\Temp\\\\tmp3_l8l99d', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n" ] } ], "source": [ "classifier = tf.estimator.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3, feature_columns=feat_cols)" ] },
{ "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Calling model_fn.\n", "INFO:tensorflow:Done calling model_fn.\n", "INFO:tensorflow:Create CheckpointSaverHook.\n", "INFO:tensorflow:Graph was finalized.\n", "INFO:tensorflow:Running local_init_op.\n", "INFO:tensorflow:Done running local_init_op.\n", "INFO:tensorflow:Saving checkpoints for 0 into C:\\Users\\Marcial\\AppData\\Local\\Temp\\tmp3_l8l99d\\model.ckpt.\n", "INFO:tensorflow:loss = 15.285385, step = 1\n", "INFO:tensorflow:Saving checkpoints for 50 into C:\\Users\\Marcial\\AppData\\Local\\Temp\\tmp3_l8l99d\\model.ckpt.\n", "INFO:tensorflow:Loss for final step: 3.4342575.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "classifier.train(input_fn=input_func, steps=50)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Model Evaluation\n", "\n", "**Use the predict method of the classifier model to create predictions from X_test.**" ] },
{ "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": true }, "outputs": [], "source": [ "pred_fn = tf.estimator.inputs.pandas_input_fn(x=X_test, batch_size=len(X_test), shuffle=False)" ] },
{ "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Calling model_fn.\n", "INFO:tensorflow:Done calling model_fn.\n", "INFO:tensorflow:Graph was finalized.\n", "INFO:tensorflow:Restoring parameters from C:\\Users\\Marcial\\AppData\\Local\\Temp\\tmp3_l8l99d\\model.ckpt-50\n", "INFO:tensorflow:Running local_init_op.\n", "INFO:tensorflow:Done running local_init_op.\n" ] } ], "source": [ "note_predictions = list(classifier.predict(input_fn=pred_fn))" ] },
{ "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'class_ids': array([2], dtype=int64),\n", " 'classes': array([b'2'], dtype=object),\n", " 'logits': array([-3.6269774 , 0.16824062, 1.2134217 ], dtype=float32),\n", " 'probabilities': array([0.00581369, 0.2586391 , 0.7355472 ], dtype=float32)}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "note_predictions[0]" ] },
{ "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Keep only the predicted class id from each prediction dict\n", "final_preds = []\n", "for pred in note_predictions:\n", "    final_preds.append(pred['class_ids'][0])" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Now create a classification report and a confusion matrix. Does anything stand out to you?**" ] },
{ "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.metrics import classification_report, confusion_matrix" ] },
{ "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[20  0  0]\n", " [ 0  6  0]\n", " [ 0  0 19]]\n" ] } ], "source": [ "print(confusion_matrix(y_test, final_preds))" ] },
{ "cell_type": "code", "execution_count": 21, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "             precision    recall  f1-score   support\n", "\n", "          0       1.00      1.00      1.00        20\n", "          1       1.00      1.00      1.00         6\n", "          2       1.00      1.00      1.00        19\n", "\n", "avg / total       1.00      1.00      1.00        45\n", "\n" ] } ], "source": [ "print(classification_report(y_test, final_preds))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Great Job!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 1 }