Modeling a Server Farm

The main purpose of this project is to use our Queue structure to model job division in a multi-server (or processor) setting. Typically, the model has jobs arriving over time and upon arrival must immediately be assigned to a server for processing. Each server maintains its own queue of jobs for processing and operates under the rule that it must process jobs in the order of their arrival.


Tasks

Familiarize yourself with the Job class

We have provided you with the Job class you will need for this project. You do not need to change any of the code in the Job class, but you should familiarize yourself with its methods as you will need to use them. The Job class has the following useful methods implemented. You can expect to use all of them by the end of the project.

The Server class

The first class you will write for this project is a Server class. An individual Server needs to manage a Queue of Jobs, allowing for jobs to be processed in a FIFO (first-in-first-out) manner. Additionally, the Server needs to track system time (i.e. the time on the clock) and some summary statistics about the jobs it is processing that will be needed to analyze the simulation.

You must maintain a Queue of jobs using your LinkedList class that implements the Queue interface. You should also maintain several fields for accounting, including a double named time to track system time, a double named totalWaitingTime to track the total time jobs spent between arriving at the server farm and being finished. Maintain another double field named remainingTime that tracks the remaining amount of processing time needed for all jobs still in the queue. Finally, maintain an int field called numJobs to track the number of jobs processed so far by the server. All of these fields can start at 0 at the beginning of the simulation.

Implement the following methods for the Server class:

processTo Method Explanation

The processTo method should start by computing how much time it should spend processing jobs--i.e. the difference between the current system time, and the time passed into processTo. We'll call this timeLeft. Your processTo method should process individual jobs in the Queue, one at a time, in a FIFO order, while timeLeft is greater than 0. Your Server cannot spend more time processing an individual job than the minimum of the job's remaining processing time, and timeLeft. When this method finishes processing a job, it should remove it from the queue, add the job's waiting time to the total waiting time for all jobs processed by the server, and incrememnt the number of jobs processed by the server. If there is still timeLeft (i.e. it's greater than 0), the server should continue processing jobs. Remember to update the system time at the end.

Take a look at these slides for a visual of how time is handled in this simulation.

Picking the "Right" Server for the Job

Now we have a Server class that is ready to accept and process jobs for us. The next step in this project is to build a JobDispatcher class that will allow us to create a collection of K Servers and allocate incoming jobs to the most appropriate Server.

With 5, or 10, or 100 different Servers to choose, how should decide where to send jobs when they come in? There are multiple reasonable answers to this question and we will implement and compare several. Specifically, we will implement a RandomDispatcher that chooses a Server at random; a RoundRobinDispatcher that cycles through the Servers in order; a ShortestQueueDispatcher that chooses the Server wth the fewest jobs in its Queue; and a LeastWorkDispatcher that picks the server with the least remaining processing time across the jobs in its Queue.

However all of these types of JobDispatchers share common methods, so instead of rewriting these four times, we will make an abstract class to work as the framework for how each JobDispatcher will function.

An Abstract JobDispatcher class

Make an abstract JobDispatcher class: public abstract class JobDispatcher. A JobDispatcher should manage a collection of Servers (implemented as an ArrayList). In order to let your ArrayList of servers be accessed by the subclasses of our abstract JobDispatcher class, make its access modifier protected.

Other class fields your JobDispatcher should maintain include a double for system time, a ServerFarmViz object to use for visualizing the simulation, and an int to track the number of jobs handled so far.

When jobs arrive, your JobDispatcher will advance the system time to their arrival, which requires all Servers to process jobs in their respective queues up to that arrival time. The JobDispatcher will then pick which Server to send the job to, but we'll leave that method abstract for now: public abstract Server pickServer(Job j).

Below are the methods to implement for the JobDispatcher class:

Implementing Specific Dispatchers

Now that we've built the framework, we're ready to make a few different dispatchers. Each one will extend the JobDispatcher class, which means we'll inherit the methods we've defined so far for free and only need to implement the constructor and the pickServer method. N.B.: when you write the constructors for each class make sure to call the parent's constructor via the super method.


Exploration

Now that we have our dispatchers, we're ready to compare them! We will also look at what happens to wait time as you change the number of servers in the server farm. (Note, make sure to set showViz to false or you'll be waiting forever!).

Modify the ServerFarmSimulation we gave you to run experiments to answer the following questions. We must be able to replicate your results by running this file! That means that your ServerFarmSimulation needs to run all of your experiments, and nicely format the results before printing them so that we can tell what we're looking at!

Required result 1:Make a table comparing the average waiting time of a job for the 4 job dispatchers with 34 servers and 10000000 jobs. Keep meanArrivalTime set to 3 and meanProcessingTime set to 100. Which one(s) works best? Which one(s) worst? Is that what you expected?

Required result 2:Make a graph showing the average waiting time of a job for the shortestQueue dispatcher as you increase the number of servers from 30 to 40. As before, run the simulation for 10000000 jobs. Keep meanArrivalTime set to 3 and meanProcessingTime set to 100. How many servers do you need to not fall too far behind? Does this match your expectations? (For an extension, you can run some follow up experiments and see if you can make a guess at how to answer this questions mathematically.)


Reflection Questions

Please answer the following 2 questions. Put any written responses numbered in the Reflection section of your report, and include any code with your other code files. This section is worth a substantial portion of your project grade, and generally includes the type of more conceptual exercise you can expect to see on exams, so it's good preparation for that!

Question 1: One Queue or Many Queues?

The goal is to analyze the following question without having to write any code: what's better: each Server maintaining its own Queue, or a singular Queue that Servers poll from?

The first case is what you've already programmed for this project. The latter would instead have each Job sitting in one (longer) Queue, from which Servers poll from whenever they aren't actively working on a job.

It turns out, for the LeastWorkDispatcher (which you probably found to be the best dispatcher) is exactly the same for these two models! Try writing down some small examples and observe it on your own - then in your report write down an argument as to why they are the same, as if you were trying to explain it to your fellow classmates.

Question 2: Breaking the Rules

What if Servers didn't have to work on jobs in the order that they arrived? What if Servers could work on any job that had been assigned to them?

In particular, what if a Server always prioritized whatever Server in its queue had the least remaining work?

  1. Add the following methods to your LinkedList:
    • E findMin(Comparator<E>comparator): this method needs to use the given Comparator to find the minimal item in the list.
    • E removeMin(Comparator<E>comparator): this method removes the minimal item from the list.

    In both of these methods, you can use comparator's compare method to figure out which of 2 Jobs passed into it should be prioritized (lower processing time means higher priority). Check out the template for the Comparator below, and the documentation here for more details about how to use the compare method, and how to interpret its output. When writing the method, you can assume the comparator exists. You will need to construct it in the following steps when calling these methods.

  2. Create a new PreemptiveServer class. Copy-paste everything from your regular Server class into it.
  3. The only thing we'll change is the PreemptiveServer's processTo method. Update it so that always works on the job of minimal remaining processing time.
  4. When you call findMin and removeMin in your updated process to method, you will need to make a useful comparator and pass it in as an argument. You can do this using the following code as a template. We will use this in future projects, and spend some time unpacking what this chunk of code does then, but the short answer is that you're making an anonymous class that implements the Comparator interface, and overwriting the compare method so that it orders 2 jobs the way we want it to (based on which job has less processing time).
    		Job j = removeMin(new Comparator<Job>() {
    
            		@Override
            		public int compare(Job o1, Job o2) {
    				//Write some code that returns 0 if the jobs have the same remaining processing time, returns -1 if o1 has a longer processing time than o2, and 1 otherwise.
            		}
    
        		});
    		

  5. Update your tests used for the exploration section to instead use these Preemptive servers. Do you see any improvement in the average wait time of jobs? Detail and discuss your results - consider whether they match your expectations.

Extensions

The following are some suggested extensions. You should feel free to pursue your own custom extensions to your project that you find interesting. Please clearly identify your extensions in your report. In your code, make sure the baseline simulations run properly. Making a separate sub-folder for major extensions is recommended.

  1. Explore different dispatch methods. Can you beat whichever method you determined to be the 'best' in the exploration section?
  2. Explore different objectives. For example, instead of minimizing average time in the system, instead minimize the average ratio of job size to its wait time.
  3. In our project so far, all Servers have been the same. If some of my servers were faster than others, how should I use them?
  4. Given a job sequence where on average jobs arrive every x seconds and take y total units of processing time, what seems like a minimal number of Servers necessary to handle those jobs without falling uncontrollably far behind? Does your best dispatcher support this intuition?

Congrats, you're done with everything except for your project report where you will communicate your findings from all that code! Make sure to write up your report in the project report file in the Google Classrooms assignment. The project report is an important part of the work in this project, so don't forget to do it!