Over the last few months, as I’ve approached natural language processing (NLP), I’ve ploughed through several Udemy courses covering software and mathematics, then set up a project and a preliminary portfolio.
Having a specific target for education within the computer sciences and STEM topics is both relieving and overwhelming.
A Short Affair with MATLAB, then on to Jupyter
The “MATLAB On Ramp” course gave me a good idea of how the popular MATLAB software works. The software is dedicated to scientific computing and is especially useful for linear algebra.
The time I spent on the course, however, was only partially useful, as I eventually let go of MATLAB in favor of Jupyter.
I like how Jupyter lets the user work in a full programming language, Python. The Python community is large and enthusiastic, so by using Jupyter, and thus Python, I find myself progressing in more topics than one.
A Brief Taste of Linear Algebra
The “Complete Linear Algebra” course was interesting for the first third, but I found that the content moved too swiftly through the concepts for the course to be useful. I got the gist of Linear Algebra, and that’ll have to be enough for now.
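Even after dropping the course, the gist is easy to keep alive in a Jupyter notebook. Here is a minimal sketch in Python with NumPy of a few staple operations such a course covers (the matrices are made-up examples, not course material):

```python
import numpy as np

# A made-up 2x2 system: solve Ax = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)   # exact solution of the linear system
print(x)                    # → [1. 3.]

# A few other staples from a first linear-algebra course
print(A @ A)                # matrix multiplication
print(np.linalg.det(A))     # determinant: 2*3 - 1*1 = 5
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
```

Playing with small examples like this in a notebook is, for me, a better way to retain the gist than re-watching lecture videos.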
At my current point in my own NLP learning curve, I have studied so many disparate courses and topics over the years that pulling them together into working memory is a challenge.
I’ve studied computer science, business statistics, data science, distributed systems, networking, digital and decentralized team dynamics, creative and technical writing, presentation, and so many others.
(As an aside, I think it relevant to mention how NLP fits into many of the other studies I’ve approached. NLP is a rapidly growing niche field that is a subset of computer science and mathematics. To become proficient in NLP, a student must first gain confidence with data science, programming and machine learning, and then linguistics.)
The course above starts at a point that takes advantage of many of the other courses I’ve taken, and pulls many of the key topics therein into an approachable pathway into machine learning.
Typically, with Udemy courses I only complete about two-thirds of the video content, depending on how committed I feel once I have the majority of the study out of the way.
The machine learning bootcamp course, however, is one that I will probably finish at least 90% of, as the content is strong and relevant.
Using Vuepress to Set Up My Preliminary Portfolio
Over the last few years I have used markdown formatting and Vuepress extensively in my professional work as a technical communicator and documentation writer.
Therefore, it felt natural to use Vuepress for the portfolio section of my website.
The portfolio itself is up and running, but I will wait to connect it to this website and blog until the presentation is cleaner.
Working with a Mentor
I’ve taken so many university courses over the last several years; alas, I am still years away from a new official degree. The road is long… and expensive. I’m not giving up, but I need to find some shortcuts to a more specific data-science or otherwise NLP-related income source.
While I continue working towards that goal, I hope to find a way to get paid to take classes, instead of the other way around. So, I’m looking for a slight shortcut or two, and to that end I hired a mentor from mentorcruise.com.
So far, our time has been productive. My mentor is helping me stay focused on natural language processing, with the intent of having a few high quality portfolio pieces to show by next summer.
First Attempt at a Data-Science Portfolio Piece
Just like the heading says, I did make one first attempt at a STEM topic portfolio piece.
The topic was to perform price predictions on housing data. I’ve done similar studies before, but only while following along with a teacher or a video tutorial.
This time, I tried to take some raw data from kaggle.com and do a study all on my own.
Thus far, I’ve spent perhaps ten or fifteen hours on the project, and I’m not making much headway. My mind is stretching and I’m learning new things, but the project is revealing to me what I truly know and what I do not, and what I do not know in NLP currently vastly outweighs everything else.
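To give a sense of the shape of such a study, here is a minimal sketch in Python with scikit-learn. Note that the features and the price formula below are invented for illustration; they are not taken from the actual Kaggle dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for a housing dataset (the real study reads a Kaggle CSV;
# these columns and the pricing formula are invented for illustration).
rng = np.random.default_rng(0)
n = 500
square_feet = rng.uniform(600, 3500, n)
bedrooms = rng.integers(1, 6, n)
price = 50_000 + 120 * square_feet + 15_000 * bedrooms + rng.normal(0, 20_000, n)

# Hold out a test set so the model is scored on houses it never saw.
X = np.column_stack([square_feet, bedrooms])
X_train, X_test, y_train, y_test = train_test_split(X, price, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
print("coefficients:", model.coef_)  # should land near 120 and 15,000
```

The hard part, I am discovering, is everything this sketch skips: cleaning real data, choosing features, and interpreting the results.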
But, with continued practice and work, I hope to change this over the next few months.
Until we meet again, dear reader, thank you for following.
When I launched this new site, I had thought not to let anyone know about it for a while, as it’s new and I’m not really sure who among my old friends would be interested.
To my surprise, I found that MailChimp, the automated email-blasting service I once used to send old blog posts to my friends who have chosen to follow my story, detected this new website through a few small links in the data.
Eager to do its job, MailChimp captured and sent the previous blog post immediately, thus alerting everyone to my presence here.
So, hello, everyone! Surprise surprise. I’m back.
You are welcome to continue following along.
I do leave you with a “heads up,” however.
I am adapting to reality, and that includes engaging in topics, such as math, science, and computer programming, in a way that I would not have engaged previously on my old blog.
Over time, I would not be surprised if this blog acquires many conversational threads that deal not just with the overview of technology, but even with some nitty-gritty details.
With that in mind, on with today’s update!
The Why Before the How
For a person to earn money and provide for a family, one must acquire a skill that is useful to society, and then sell that skill to those who either do not have the time or the ability to perform the associated tasks.
As I adapt in my journey to find new skills, I continually come face-to-face with the reality that computers are better at memory and repetition than any individual human.
These abilities of a computer challenge us humans, as we ultimately cannot compete in any skill that relies on passive memory and repetition alone. We must find aspects of ourselves that we can activate that allow us to retain a degree of influence over these powerful machines.
When a person bases their income on skills that do not actively engage with computer-driven memory and repetition, that person may be more likely to suffer in the modern economy.
For example, consider a hospital nurse. To obtain a nursing license, a nurse must spend many years developing her memory of human anatomy and medical procedures. She must then repeat what she has learned in practice, through various forms of hospital educational “residencies” and other means.
In these acts of developing her memory and then repeating what she has learned, she is bound by normal human limitations on memory and repetition. She must eat and sleep, she can only memorize a certain number of things per day, and she is bound to become bored of repetition at some point in her experience.
After many years, she becomes an established nurse, with a mindset and daily pattern that allow her to maintain her job, and thus her connection to income, while also having enough freedom in life to relax and focus on the deeper aspects of the human experience when she returns home from work.
Much of what a nurse might do in her income-deriving skill set, however, can be memorized and repeated by a computer. The tasks a nurse performs are easily memorized in data storage, and as machine learning and robotics grow, the skills of a nurse can increasingly be automated.
For example, automated testing devices such as heart-rate and blood-pressure monitors replace the skill sets of a nurse from fifty years ago. Furthermore, nurses enter patient histories into computers and rely on these recordings to transfer information from one nurse to another.
How a nurse might adapt to this reality is a curious question, and to my knowledge, there are many solutions. Some nurses may rely on strong interpersonal skills, others might rely on unique certifications that are difficult to acquire and too expensive or complicated to automate, and others might establish a personal human network of friends and associates to protect their jobs.
My Current Relationship to Machine Learning
For my own part, as I seek to adapt my own career (with a long history and love for symbolism and storytelling), I seek to understand the heart of what is truly happening in the movement of automation itself.
How is automation affecting a human being’s unique “Breath of Life,” as the Hebrew scriptures call it? How are automation and consciousness itself related?
One aspect of my pursuit is to understand how machines learn and improve. In this spirit, I am a novice student in the art of Computer Science, and a subset of my interest lies in a field called Machine Learning.
Machine Learning is the field that focuses on programming machines not with a specific set of easily identified tasks, but rather with the ability to learn and adapt in an ever-changing environment.
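A toy sketch of that idea in Python, using scikit-learn (the two-cluster “task” here is invented for illustration): instead of hand-writing a rule for separating the points, the program infers one from labeled examples.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: points drawn around two centers, labeled 0 and 1.
rng = np.random.default_rng(1)
class_0 = rng.normal(loc=[0, 0], scale=0.5, size=(100, 2))
class_1 = rng.normal(loc=[3, 3], scale=0.5, size=(100, 2))
X = np.vstack([class_0, class_1])
y = np.array([0] * 100 + [1] * 100)

# No hand-written rule: the model learns the boundary from the examples.
model = LogisticRegression().fit(X, y)
print(model.predict([[0.2, 0.1], [2.8, 3.1]]))  # → [0 1]
```

If the environment changes, we don’t rewrite any rules; we retrain on new examples, which is the essence of the “learn and adapt” framing above.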
With a quick search on YouTube, I found the following video as a reference for more information, for readers who are curious.
Machine Learning, which I now refer to as “ML” for brevity, is itself a wide field.
One subfield of ML is that of programming robots. Trying to program a robot to move its body in the manner of a human is surprisingly difficult. Computer Scientists have learned that, rather than try to instruct a robot in every aspect of its own movement, it’s easier to provide the robot with a set of Machine Learning instructions that allows the robot to teach itself how to move on its own.
Here is an example from Google of using Machine Learning to allow a computer to teach itself how to walk.
These underlying simulations can then be applied to an actual physical robot, as follows.
Natural Language Processing
For my part, within the field of Machine Learning, I am interested in Natural Language Processing.
Computer Science > Machine Learning > Natural Language Processing
Natural Language Processing (NLP) focuses on training machines to interact with human language. There are many sub-fields within NLP, such as using a machine to detect whether a human speaker is in a positive or negative emotional mood, teaching a computer to generate text and speech so natural that a human cannot detect it is machine-generated, and many others.
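As a drastically simplified illustration of the first of those sub-fields, mood (sentiment) detection, here is a toy Python example. Real NLP systems learn word associations from data rather than relying on hand-picked word lists like these:

```python
import re

# Toy sentiment detector built on hand-picked word lists.
# Real systems learn these associations from large labeled datasets.
POSITIVE = {"love", "great", "wonderful", "happy", "excellent"}
NEGATIVE = {"hate", "terrible", "awful", "sad", "horrible"}

def sentiment(text: str) -> str:
    words = re.findall(r"[a-z']+", text.lower())  # strip punctuation
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("What a wonderful, happy story"))  # → positive
print(sentiment("That ending was terrible"))       # → negative
```

Even this crude approach hints at why the field is hard: sarcasm, negation (“not wonderful”), and context all defeat simple word counting.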
I’m not sure which sub-field specifically I would like to focus on, but anything that relates to my love for storytelling and symbolism will likely capture my attention.
Here is a small snippet of machine-generated text, in which a machine first reads a human-written prompt and then attempts to write a news article that naturally follows the prompt’s conversational direction.
Human Written Prompt
In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.
Machine Generated Response
The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.
Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow. …
I can see how this technology can be combined with fiction, mythology, and storytelling.
Before I can delve more deeply into Natural Language Processing, or “NLP,” I must first learn the basics of Machine Learning.
I have sought in my (sparse) free time to improve in Computer Science for several years now, and I try to avoid acquiring new learning tasks unless I am certain they will be useful.
Originally, in my pursuit of Machine Learning, I had wanted to stay close to programming languages with which I am already familiar, such as C++.
The field of Machine Learning, however, discourages the use of C++, as it is a complex language, and that complexity adds to the already challenging task of learning ML.
Instead, I keep hearing about this software called MATLAB. Many people, such as Andrew Ng of Google’s Machine Learning Lab, state that learning the art of Machine Learning is easier with MATLAB than with C++, and even easier with popular programming languages such as Python.
I finally decided to invest a few days in learning MATLAB.
The online MATLAB introductory course below is by industry expert Mike X Cohen; I completed it about two days ago. Mike’s content is fairly easy to follow, and I now feel comfortable enough with MATLAB to move back into other ML courses.
Introduction to Linear Algebra
There are a few additional prerequisite topics that I also need to study before I can approach ML more specifically. These include Linear Algebra, and Linear and Logistic Regression. The same online-course author, Mike Cohen, also has courses on these topics, and the former course is linked below.
Here is a link to Linear Algebra, which I wrapped up yesterday, and now I move into the Regression topics today.
These are not light topics of study, and I suspect that, if I am able to make a career out of ML and NLP, I will need to master these mathematical skills.
Mastery can take many months and years, and so for the immediate moment, I pursue the courses only so far as necessary to gain a sense of orientation, and then I drop them and move on.