
Commit 66522ea

Fix HTML tags
1 parent 2887ae1 commit 66522ea

File tree

1 file changed: +5 -9 lines changed


Recurrent Neural Networks/LSTM-Language-Modelling.ipynb

+5 -9
@@ -18,12 +18,12 @@
 "## What exactly is Language Modelling?\n",
 "Language Modelling, to put it simply, **is the task of assigning probabilities to sequences of words**. This means that, given a context of one or a few words in the language the model was trained on, the model should know which words or sequences of words are most likely to come next. Language Modelling is one of the most important tasks in Natural Language Processing.\n",
 "\n",
-"<img src=\"https://ibm.box.com/shared/static/1d1i5gub6wljby2vani2vzxp0xsph702.png\" width=\"768\"/>\n",
+"<img src=\"https://ibm.box.com/shared/static/1d1i5gub6wljby2vani2vzxp0xsph702.png\"/>\n",
 "<center>*Example of a sentence being predicted*</center>\n",
 "\n",
 "In this example, one can see the predictions for the next word of a sentence, given the context \"This is an\". As you can see, this boils down to a sequential data analysis task - you are given a word or a sequence of words (the input data) and, given the context (the state), you need to find out what the next word is (the prediction). This kind of analysis is very important for language-related tasks such as **Speech Recognition, Machine Translation, Image Captioning, Text Correction** and many other very relevant problems.\n",
 "\n",
-"<img src=\"https://ibm.box.com/shared/static/az39idf9ipfdpc5ugifpgxnydelhyf3i.png\" width=\"1080\"/>\n",
+"<img src=\"https://ibm.box.com/shared/static/az39idf9ipfdpc5ugifpgxnydelhyf3i.png\"/>\n",
 "<center>*The above example schematized as an RNN in execution*</center>\n",
 "\n",
 "As the above image shows, Recurrent Network models fit this problem like a glove. Alongside LSTM and its capacity to maintain the model's state for over one thousand time steps, we have all the tools we need to undertake this problem. The goal is to create a model that can reach **low levels of perplexity** on our desired dataset.\n",
@@ -38,11 +38,9 @@
 "The dataset is divided into different kinds of annotations, such as Part-of-Speech, Syntactic and Semantic skeletons. For this example, we will simply use a sample of clean, non-annotated words (with the exception of one tag - `<unk>`, which is used for rare words such as uncommon proper nouns) for our model. This means that we just want to predict what the next words will be, not what they mean in context or what their classes are in a given sentence.\n",
 "<br/>\n",
 "<div class=\"alert alert-block alert-info\">\n",
-"<center>the percentage of lung cancer deaths among the workers at the west `<unk>` mass. paper factory appears to be the highest for any asbestos workers studied in western industrialized countries he said \n",
+"the percentage of lung cancer deaths among the workers at the west `<unk>` mass. paper factory appears to be the highest for any asbestos workers studied in western industrialized countries he said \n",
 " the plant which is owned by `<unk>` & `<unk>` co. was under contract with `<unk>` to make the cigarette filters \n",
 " the finding probably will support those who argue that the u.s. should regulate the class of asbestos including `<unk>` more `<unk>` than the common kind of asbestos `<unk>` found in most schools and other buildings dr. `<unk>` said\n",
-" \n",
-"</center>\n",
 "</div>\n",
 "<center>*Example of text from the dataset we are going to use, `ptb.train`*</center>"
 ]
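The cell above only describes the corpus, but the preprocessing it implies (whitespace-tokenized text, rare words collapsed to `<unk>`) is easy to picture in code. The sketch below is not from the notebook: the file name `ptb.train.txt`, the `min_count` threshold, and the helper names are assumptions for illustration; in the real PTB files, rare words have already been replaced with `<unk>`.

```python
import collections

PATH = "ptb.train.txt"  # hypothetical local path to the training split

def build_vocab(path, min_count=5):
    """Count whitespace-separated tokens and keep an id for each frequent one."""
    counts = collections.Counter()
    with open(path) as f:
        for line in f:
            counts.update(line.split())
    vocab = {"<unk>": 0}
    for word, count in counts.most_common():
        if count >= min_count and word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def encode(path, vocab):
    """Turn the corpus into a flat list of word ids, mapping rare words to <unk>."""
    ids = []
    with open(path) as f:
        for line in f:
            ids.extend(vocab.get(w, vocab["<unk>"]) for w in line.split())
    return ids
```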
@@ -55,11 +53,9 @@
 "\n",
 "For better processing, in this example, we will make use of [**word embeddings**](https://www.tensorflow.org/tutorials/word2vec/), which are **a way of representing sentence structures or words as n-dimensional vectors (where n is a reasonably high number, such as 200 or 500) of real numbers**. Basically, we will assign each word a randomly-initialized vector and input those into the network to be processed. After a number of iterations, these vectors are expected to assume values that help the network correctly predict what it needs to - in our case, the probable next word in the sentence. This has been shown to be very effective in Natural Language Processing tasks, and is a commonplace practice.\n",
 "<br/><br/>\n",
-"<font size = 4>\n",
-" <strong>\n",
+"<strong>\n",
 "$$Vec(\"Example\") = [0.02, 0.00, 0.00, 0.92, 0.30,...]$$\n",
-" </strong>\n",
-"</font>\n",
+"</strong>\n",
 "<br/>\n",
 "Word Embedding tends to group similarly used words *reasonably* close together in the vectorial space. For example, if we use t-SNE (a dimensionality reduction visualization algorithm) to flatten the dimensions of our vectors into a 2-dimensional space and use the words these vectors represent as their labels, we might see something like this:\n",
 "\n",
