Java Multithreading - Part 1

I brought up the topic of concurrency in the Sorting post (concurrent merge sort), but never properly explained what it actually is. So in this blog post, I will go into what concurrency means, the tools Java provides to support it, and some common problems that arise when dealing with concurrency.

The Concept

Concurrency is the ability to deal with more than one task at a time. The word parallelism often gets mixed into this definition, but depending on who you talk to, the two may not mean the same thing. The way I see the difference: concurrency is the idea that different programs can overlap and interleave with one another, so that it appears to the user that multiple applications are running at the same time, when the processor is actually handing each one small time slices in turn. Parallelism is the idea that two programs actually run at the exact same time, which is made possible by CPUs with multiple cores, where each core can run a different application simultaneously.

Parallelism and Concurrency as defined by Oracle:
  • Parallelism - A condition that arises when at least two threads are executing simultaneously.
  • Concurrency - A condition that exists when at least two threads are making progress.  A more generalized form of parallelism that can include time-slicing as a form of virtual parallelism.

Concurrency and Parallelism in the Real World


Concurrency and parallelism play a huge role in high-performance computing. Today's processors have hit a physical wall: clock speeds cannot be pushed much higher without running into heat problems in the silicon that makes up most, if not all, computer chips. To keep moving computing forward, the industry turned to multiple processors running in parallel with one another. It has been almost a decade since dual-core technology emerged as consumer-grade tech, and today our phones can pack four or even eight cores (Samsung announces 8 core mobile processor). As long as we continue with the current technology trends, being able to write multi-threaded applications will only become more important.

Processes and Threads

Two key terms that you need to know before we go on:
  1. Processes - A process is a self-contained execution environment with its own private memory space. Many view processes as applications, but in actuality an application can be made up of multiple cooperating processes.
  2. Threads - Sometimes referred to as "lightweight processes", threads resemble processes but take fewer resources to create. Every process has at least one thread. Threads share the memory space of their process.
The rest of this post will mainly deal with threads, but keep in mind that threads always live inside a process.

Thread Methods

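(Thread).start() - Starts the thread. start() itself takes no arguments; the work to be done is supplied beforehand, either by passing a Runnable to the Thread constructor or by subclassing Thread. A class that implements Runnable must provide a run() method, and a class that extends Thread should override run(); that method holds the action the thread will perform.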
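Here is a minimal sketch of what starting a thread can look like. The class name StartDemo, the helper runOnNewThread(), and the thread name "worker-1" are made up for illustration. The Runnable is handed to the Thread constructor, and start() is then called with no arguments.

```java
class StartDemo {
    // Runs a small task on a new thread and returns the name of the thread that ran it.
    static String runOnNewThread() {
        String[] ranOn = new String[1];
        // The Runnable's run() body is the action the new thread performs.
        Runnable task = () -> {
            ranOn[0] = Thread.currentThread().getName();
        };
        Thread worker = new Thread(task, "worker-1");
        worker.start();              // asks the JVM to run task.run() on the new thread
        try {
            worker.join();           // wait for the worker so its result can be read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn[0];
    }

    public static void main(String[] args) {
        System.out.println("Task ran on: " + runOnNewThread());
    }
}
```

Note that calling task.run() directly would execute the task on the calling thread; only start() creates a new thread of execution.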

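(Thread).interrupt() - Sends this thread an interrupt, which sets its interrupt flag to true. If the thread is blocked in a method such as sleep(), wait(), or join(), that method throws an InterruptedException instead, and the flag is cleared. What happens next depends on how you handle the InterruptedException: the code can catch it and carry on, stop what it is doing, or throw it on to its caller.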
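As a rough sketch of how an interrupt reaches a blocked thread, the example below starts a worker that sleeps and then interrupts it. InterruptDemo and interruptSleepingWorker() are hypothetical names, and the 60-second sleep is just a stand-in for any long blocking call.

```java
class InterruptDemo {
    // Starts a sleeping worker, interrupts it, and reports how the worker ended.
    static String interruptSleepingWorker() {
        String[] outcome = new String[1];
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);    // long blocking call; an interrupt cuts it short
                outcome[0] = "finished sleeping";
            } catch (InterruptedException e) {
                // When sleep() throws InterruptedException, the interrupt flag is cleared.
                outcome[0] = "interrupted, flag is now " + Thread.currentThread().isInterrupted();
            }
        });
        worker.start();
        worker.interrupt();              // request that the worker stop waiting
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return outcome[0];
    }
}
```

Here the worker chooses to catch the exception and finish quietly. Code that cannot handle the interrupt itself usually either rethrows the exception or calls Thread.currentThread().interrupt() to set the flag again for its caller.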

Thread.sleep() - Pauses the current thread for a specified period of time. Be careful: the sleep time is not guaranteed to be precise, and the sleep ends early with an InterruptedException if another thread interrupts it.
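A small sketch of sleep() in use, assuming a hypothetical helper sleepAndMeasure() that reports how long the pause actually took. The measured time will usually be close to the requested time, but it is not exact.

```java
class SleepDemo {
    // Pauses the calling thread and returns the elapsed time in milliseconds.
    static long sleepAndMeasure(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis);        // static method: always pauses the calling thread
        } catch (InterruptedException e) {
            // The sleep was cut short; restore the flag so callers can see the interrupt.
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("Slept for about " + sleepAndMeasure(50) + " ms");
    }
}
```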

(Thread).join() - The current thread waits for the specified thread to finish before it continues.
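To illustrate join(), here is a sketch in which the calling thread hands a sum to a worker and waits for it to finish. JoinDemo and sumOnWorker() are names invented for this example.

```java
class JoinDemo {
    // Sums 1..n on a worker thread; join() makes the caller wait for the result.
    static long sumOnWorker(int n) {
        long[] total = new long[1];
        Thread worker = new Thread(() -> {
            for (int i = 1; i <= n; i++) {
                total[0] += i;
            }
        });
        worker.start();
        try {
            worker.join();               // blocks until the worker's run() has returned
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // After a successful join(), everything the worker wrote is visible here.
        return total[0];
    }
}
```

Without the join(), the caller could read total[0] before the worker had finished, or before it had even started.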

Synchronization


Now I'm going to try to explain the concept of concurrency in terms of Java. (Example found in the Java Tutorials.)

Counter Example

Say we have a class called "Counter", which has two methods, increment() and decrement().

public void increment() { counter++; }
public void decrement() { counter--; }

While counter++ and counter-- each take up only one line of code, each one actually consists of three separate steps.

  1. Retrieve the value of counter.
  2. Increment/decrement the counter by 1
  3. Store the new value into counter
Now imagine we have two separate threads that share this same counter object.  Thread A will execute increment() and Thread B will execute decrement() at the same time.

Here is the list of operations that each thread will execute.

Thread A                                        | Thread B
Thread A gets counter                           | Thread B gets counter
Thread A increments retrieved value of counter  | Thread B decrements retrieved value of counter
Thread A stores new value into counter          | Thread B stores new value into counter

Here is one example of what will happen to the object counter if it is not synchronized correctly:

Let's assume counter = 3

  1. Thread A gets counter value 3.
  2. Thread B gets counter value 3.
  3. Thread A increments retrieved value 3 to 4.
  4. Thread B decrements retrieved value 3 to 2.
  5. Thread A stores new value 4 into counter.
  6. Thread B stores new value 2 into counter.
End result: The new value of counter will be 2, and the update made by Thread A is lost.
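Real threads interleave unpredictably, so the six steps above are hard to reproduce on demand. The sketch below replays them on a single thread instead, with the local variables a and b standing in for each thread's private copy of the value. LostUpdateReplay and replay() are made-up names, and this is only a simulation of the interleaving, not actual multithreaded code.

```java
class LostUpdateReplay {
    static int counter;

    // Replays the interleaving step by step and returns the final counter value.
    static int replay() {
        counter = 3;
        int a = counter;     // 1. Thread A gets counter value 3
        int b = counter;     // 2. Thread B gets counter value 3
        a = a + 1;           // 3. Thread A increments its copy to 4
        b = b - 1;           // 4. Thread B decrements its copy to 2
        counter = a;         // 5. Thread A stores 4
        counter = b;         // 6. Thread B stores 2, overwriting A's update
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("Final counter: " + replay());   // prints "Final counter: 2"
    }
}
```

One increment and one decrement should leave the counter at 3, but this ordering leaves it at 2.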

Thread Interference and Race Condition

This is just one example of how concurrency can go wrong if your code is not written correctly. In programming terms, this is called "thread interference": two threads operate on the same data at the same time, and their individual steps interleave.

This is also an example of a "race condition", which occurs when two threads try to manipulate the same data and the result depends on the order in which the threads happen to run. Because that order is unspecified, the result of the block of code is indeterminate. If Thread B had stored its value before Thread A, the final value would have been 4, with the value from Thread A being stored and the update from Thread B being lost.
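One common way to avoid this kind of interference is to mark the methods as synchronized, so that only one thread at a time can run them on a given object. The sketch below is modeled on the Counter from the example; the value() method, the SynchronizedCounterDemo class, and the iteration count are additions for illustration.

```java
class Counter {
    private int counter = 0;

    // synchronized: the read, modify, and store steps run as one uninterrupted unit.
    public synchronized void increment() { counter++; }
    public synchronized void decrement() { counter--; }
    public synchronized int value() { return counter; }
}

class SynchronizedCounterDemo {
    // Thread A increments and Thread B decrements the same Counter the same number of times.
    static int run(int iterations) {
        Counter shared = new Counter();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) shared.increment();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) shared.decrement();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared.value();
    }
}
```

With the synchronized keyword in place, SynchronizedCounterDemo.run(100000) should always come back as 0. If you remove the keyword from increment() and decrement(), the result can vary from run to run.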

Bank Account Example

As a quick example to help you understand how this might affect the real world, take a bank account. We'll create two methods:
  • deposit(int amount)
  • sendMoney(int amountToSend, BankAccount account)
Inside a BankAccount is a field called "balance". deposit takes an amount of money and adds it to "balance". sendMoney takes another bank account, adds the money to that account, and subtracts the same amount from your own. Let's say you deposit $1500 from your last paycheck, and at the same time a friend who owed you for lunch sends you $12. If the code is not written correctly, the two updates to your balance can interfere with each other, and you have a chance of losing your entire paycheck over a simple lunch tab. Scary.
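Here is one way the bank account could be sketched so that the paycheck and the lunch money cannot overwrite each other. Only deposit and sendMoney come from the description above; the constructor, withdraw, getBalance, the BankDemo class, and the friend's opening balance of $100 are assumptions made for this example, and there are no overdraft checks.

```java
class BankAccount {
    private int balance;

    BankAccount(int openingBalance) { balance = openingBalance; }

    // Each balance update is synchronized, so updates to one account cannot interleave.
    public synchronized void deposit(int amount) { balance += amount; }
    private synchronized void withdraw(int amount) { balance -= amount; }
    public synchronized int getBalance() { return balance; }

    // Locks one account at a time (never both at once) to avoid deadlock.
    public void sendMoney(int amountToSend, BankAccount account) {
        withdraw(amountToSend);
        account.deposit(amountToSend);
    }
}

class BankDemo {
    // Returns { your balance, your friend's balance } after both transactions finish.
    static int[] run() {
        BankAccount you = new BankAccount(0);
        BankAccount friend = new BankAccount(100);
        Thread paycheck = new Thread(() -> you.deposit(1500));
        Thread lunch = new Thread(() -> friend.sendMoney(12, you));
        paycheck.start();
        lunch.start();
        try {
            paycheck.join();
            lunch.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { you.getBalance(), friend.getBalance() };
    }
}
```

Whichever thread runs first, you end up with $1512 and your friend with $88. One trade-off of this simple design is that, for a brief moment during sendMoney, the money has left one account and not yet arrived in the other.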

Memory Inconsistency Error

Memory inconsistency occurs when different threads have different views of what should be the same data (i.e. one thread may see a field at one value and another thread may see it at a different value). This Stack Overflow link helped me understand this topic a bit better: Java Thread Memory Consistency Error - Stackoverflow

What we have here is known as a "visibility problem": a change made by one thread may not be seen by a different thread.

An example in the Stack Overflow discussion thread was:

x = y + z;
r = x + z;

In a single-threaded environment, these statements would run in sequence, and the value of r is determinate. But if these two statements were split across two different threads, r may very well end up with a different value, because the second thread may not see the change to x that was made in the first thread.

What we need to solve this problem is a happens-before relationship, which guarantees that once x is changed by the first thread, the second thread sees the new value of x.

One way of solving this example would be to declare x to be volatile. More on volatile here. A quick way to think about volatile is that a volatile variable is never cached in thread-local memory; instead, all reads and writes go directly to main memory. More precisely, a write to a volatile variable happens-before every later read of that same variable, which acts as a "memory barrier" and protects against memory inconsistencies. Keep in mind that volatile only addresses visibility; it does not make compound actions such as counter++ atomic. Another way to do this is to use a synchronized method or block.
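Here is a rough sketch of the x and r example with x declared volatile. The class VolatileDemo, the run() helper, and the polling loop are my own additions for illustration. The sketch also assumes that y + z is never 0, so that 0 can stand for "x has not been written yet".

```java
class VolatileDemo {
    // volatile: a write by one thread is visible to later reads by other threads.
    static volatile int x;

    // Thread one computes x = y + z; thread two waits for x and computes r = x + z.
    static int run(int y, int z) {
        x = 0;                           // 0 means "not written yet" in this sketch
        int[] r = new int[1];
        Thread one = new Thread(() -> {
            x = y + z;
        });
        Thread two = new Thread(() -> {
            int seen;
            while ((seen = x) == 0) {    // each pass performs a fresh volatile read of x
                Thread.yield();
            }
            r[0] = seen + z;
        });
        two.start();
        one.start();
        try {
            one.join();
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return r[0];
    }
}
```

For example, VolatileDemo.run(2, 3) returns 8. Without the volatile keyword, thread two would have no guarantee of ever seeing the new value of x, and the loop might never end.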

This is a pretty long post, so I'll split it into two posts. Here is a summary of the important concepts that we covered in this blog post.
  1. Concurrency and parallelism are two concepts that represent computer science's approach to productivity and performance.
  2. A process is an entity that has its own private memory space and contains one or more threads.
  3. Threads share the memory of their process, but each thread has its own run-time stack.
  4. Thread Methods - Start, Sleep, Interrupt, Join
  5. Thread interference and race conditions make the result of code indeterminate.
  6. Memory inconsistency is a problem where two different threads see different values for the same variable.
Thanks again for reading, and I hope you learned a little more about the complexities that modern-day computing faces.

~James
