How we structure a machine learning project and its train and test phases has a lot to do with how we move through the ML “life cycle” and bring a model from a training environment to a production environment.
One of the simplest reasons to use the above model, putting ML training on a local machine and then moving execution to a server-based system, is the essential separation of duties it provides. In general, you want the training set to be isolated, so that you have a clear picture of where training stops and testing begins. This KDNuggets article covers the principle in broad strokes while also walking through some of the other reasons to isolate training sets on a local machine. Another basic value proposition of this model is that, with the training and test sets on entirely different architectures, there is no risk of accidentally mixing test data into training, the kind of leakage that inflates evaluation results.
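To make the idea of a clean train/test boundary concrete, here is a minimal sketch of splitting data once, up front, on the local machine and persisting the two partitions separately. It assumes scikit-learn and pandas are installed; the file name “customer_data.csv” and the “label” column are hypothetical placeholders.

```python
# A minimal sketch of isolating the train/test boundary up front.
# "customer_data.csv" and the "label" column are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("customer_data.csv")  # local, offline data file
X, y = df.drop(columns=["label"]), df["label"]

# Split once, with a fixed seed, before any modeling code runs,
# so the boundary between training and testing is unambiguous.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Persist the partitions as separate files; the test set is never
# touched again until final evaluation.
X_train.assign(label=y_train).to_csv("train.csv", index=False)
X_test.assign(label=y_test).to_csv("test.csv", index=False)
```

Writing the split to disk once, rather than re-splitting inside each experiment script, is one simple way to keep the allocation fixed across the whole project.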
Another interesting benefit has to do with cybersecurity. Experts point out that if you run the initial training processes on a local machine, that machine doesn't have to be connected to the internet at all. This strengthens security in a fundamental way, “incubating” the process until it reaches the production world, where you then have to build adequate security into the server model.
In addition, some of these “isolated” models may help with problems like concept drift and hidden contexts. The principle of “non-stationarity” warns developers that data does not “stay the same” over time (depending on what's being measured), and that it can take a lot of adaptability to make the test phase match the train phase. In some cases, the train and test processes blend together, creating confusion.
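One simple way to see non-stationarity in practice is to compare the distribution of a feature at training time against the same feature in live data. The sketch below uses a two-sample Kolmogorov-Smirnov test from SciPy to flag data drift in a single feature, which is one ingredient of the broader drift problem; the synthetic arrays and the 0.05 threshold are illustrative assumptions.

```python
# A minimal drift-check sketch, assuming SciPy and NumPy are
# available; the samples and threshold below are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)  # as seen at training time
live_feature = rng.normal(loc=0.4, scale=1.0, size=5000)   # shifted production data

# The two-sample KS test asks whether both samples plausibly come
# from the same distribution; a small p-value suggests drift.
stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.05:
    print(f"Possible drift detected (KS={stat:.3f}, p={p_value:.4f})")
else:
    print("No significant drift detected")
```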
Deploying the test phase to a server for the first time can also support various “black box” models that address the problem of data adaptability. In some cases, it eliminates the redundant work of pushing change orders to multiple platforms, since the server becomes the single place where the model gets updated.
Then, also, the server environment serves the real-time or dynamic processes in which engineers will want the data-transfer and code patterns that work best for production ML. For example, AWS Lambda may be an attractive option for handling the small, discrete functions of production (or a combination of Lambda and S3 object storage); without that server-side connectivity, these real-time processes become impossible.
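As a rough sketch of what that Lambda-plus-S3 pattern might look like, the handler below loads a trained model artifact from S3 and serves predictions. The bucket name, object key, payload shape, and pickle format are all hypothetical, and it assumes boto3 and the model's library are bundled in the deployment package or a Lambda layer.

```python
# A minimal AWS Lambda handler sketch: pull a trained model from S3
# and serve predictions. The bucket, key, and payload shape are
# hypothetical placeholders.
import json
import pickle

import boto3

s3 = boto3.client("s3")
_model = None  # cached across warm invocations


def _load_model():
    global _model
    if _model is None:
        obj = s3.get_object(Bucket="my-ml-artifacts", Key="models/model.pkl")
        _model = pickle.loads(obj["Body"].read())
    return _model


def lambda_handler(event, context):
    model = _load_model()
    features = json.loads(event["body"])["features"]  # e.g. [[0.1, 2.3, 0.7]]
    prediction = model.predict(features).tolist()
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

Caching the model in a module-level variable is a common way to avoid re-downloading the artifact on every warm invocation of the function.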
These are some of the issues developers may think about when they consider how to partition the ML training phase from testing and production.