MSSQL on Linux – the details
When it became known that Microsoft was going to port MSSQL to Linux, many sysadmins surely thought that it would end in catastrophe. Enterprise providers have not been brilliantly successful with Linux ports so far – trying to install (and configure) Oracle DB on a Linux server is one of the worst experiences known to an admin. With Microsoft, it looks about like this:
$ curl https://packages.microsoft.com/config/rhel/7/mssql-server.repo | sudo tee /etc/yum.repos.d/mssql-server.repo
$ sudo yum update
$ sudo yum install mssql-server
It’s astonishing but true – Microsoft really has figured out how to package software for Linux. \o/
But what does MSSQL for Linux look like up close? Engineers from Adfinis SyGroup risked a glance behind the curtains.
Under the hood
One of the main credos for porting MSSQL to Linux is that the run-time behaviour does not change as compared to Windows. Applications should not even notice if a tux finds its way into the backend.
Right away, this begs the question of how Microsoft is supposed to initiate such an undertaking. After all, there are always edge cases if you port a solution to a completely different platform. These edge cases are precisely what we want to exclude in a central enterprise application. After all, in a critical solution, even the tiniest regressions could cause tremendous damage. If we look at the many dependencies that MSSQL has, we can tell that an actual porting, incl. all libs and environments like CLR would be a monster of a task, difficult to justify, considering the amount of work required. Situations like this demand innovation. And we think that Microsoft has chosen a very creative solution.
One thing is clear straight away. Microsoft is using more or less the same executables and libraries in Linux as in Windows. The following overview shows how the whole thing works.
SQL OS
Starting with SQL Server 2005, Microsoft had already begun grouping operating system-dependant components in their own module, called SQLOS. At the time, the goal was not yet to support other platforms but rather to be able to prepare in-depth monitoring and analysis functions. However, as a bonus, this resulted in most access to the operating system becoming centralised. This served as a major advantage later on for porting to Linux. Although, this implementation was, of course, not yet designed to support different operating systems, and many Windows specifics were still not evident for the higher layers.
This layer allowed for the system calls to be ported centrally. But the other modules and libraries used by SQL servers would still not benefit from it. Luckily, another department within Microsoft had begun implementing an abstraction for Windows processes – Project Drawbridge. Originally, it was meant to simplify the virtualisation of Windows systems.
Drawbridge
Drawbridge is a research project by Microsoft, which began back in 2011. Essentially, Drawbridge was about reducing overhead in the virtualisation of Windows systems. In order to do this, Microsoft needed the following components, in a nutshell:
- Process isolation
- Layer, which represents the Windows ABI (kernel/syscalls)
- User space implementation of the Windows APIs/libs (win32 subsystem)
- Drivers on the guest system to initialise the memory for the VM
These ingredients already give us a relatively clear picture of what happens in Drawbridge when a VM is created. First off, the driver initiates the memory for the virtual machine on the guest system. In this step, a layer is loaded which covers the Windows ABI. This is accomplished with a user space NT kernel. In addition, a library is loaded in the memory of the VM which represents the Windows APIs. Both the user mode kernel and the library, which represents the many APIs, are part of LibOS. As soon as the VM is initiated with LibOS, Windows binaries can be executed in it, and Windows libraries can be loaded without the need for a syscall or API call to go out from the VM.
As far as we know, Drawbridge has never been integrated in a product. But it did demonstrates how “easy Windows virtualisation” could be done.
LibOS
As mentioned earlier, Drawbridge contains a complete user space implementation of the Windows system calls and the APIs needed in Windows. So Drawbridge implemented a type of user space version of the NT kernel (known as NTUM – “user mode NT”). Plus, LibOS has a library which represents the Windows APIs in the user space. Compared to the actual implementations in a normal Windows environment, both elements are kept much leaner, but they result in a run-time environment which does not behave any differently than a conventional Windows one.
SQL PAL
In combination, Drawbridge and SQLOS allow for MSSQL servers to be run in Linux. But in their implementation, they go way too far in some respects and partly solve the same problems. For this reason, the team at Microsoft decided to extract the relevant parts and use Drawbridge, LibOS and the SQLOS abstraction layer to construct a new platform abstraction layer (PAL). This one only contains the ABI and APIs relevant for SQL servers and features everything that’s required in order to provide a run-time environment for MSSQL. The bottom layer consists of a Linux binary (in the user space) which initiates the rest of the environment.
To-do list for SOS and LibOS
The current status is that MSSQL is already running smoothly in Linux, but the Microsoft team is not yet finished consolidating SOS and LibOS. So for the time being, there are two versions of SOS / LibOS within the SQL server for Linux. The “upper” version forwards calls to SQL PAL, which then uses the newer version of SOS to send queries to the Linux kernel via a host bridge. Microsoft is currently working intensively on eliminating the need for these two versions and harmonising these two instances so that, in the end, the entire abstraction can be performed in SQL PAL.
The final architecture at a glance
As already mentioned, Drawbridge and SQLOS have a few overlaps which Microsoft is still working on eliminating. Overall, SQL PAL comprises the run-time environment for MSSQL and the RPM/DEB packet brings all executables, as well the necessary additional Windows libraries along with it in order to be able to run MSSQL, based on SQL PAL. According to Microsoft, around 81MB uncompressed Windows libraries will be included in the supply (that’s just about 1% of a full Windows installation). The SQL PAL binary is about 8MB.
The software is installed in `/opt` and is managed with systemd. Microsoft supplies everything ready-to-use and with man pages included.
Generally, the process for starting MSSQL on Linux can be described as follows: the host component (standard Linux binary) first starts SQL PAL, which then initiates the MSSQL server in the “emulated” Windows environment.
The final architecture looks about like this:
Our readers will probably already have noticed that this approach is very similar to the [Windows Subsystem for Linux](https://blogs.msdn.microsoft.com/wsl/2016/04/22/windows-subsystem-for-linux-overview/).
Initial experiences
In collaboration with one of our customers, we are already working on the first migration of an MSSQL DB to Linux.
The initial experiences have been very good and demonstrate that the concept also works in practice.
To manage the DBs under Linux, we can use the CLI, as well as SSMS. Although the latter still has to be run in Windows for now. As a cross-platform alternative for folks who don’t use Windows, Microsoft offers an official mssql extension for Visual Studio Code, which can be run on Linux, Mac, or Windows. While it’s not a full-blown SSMS, it’s pretty capable and has IntelliSense support.
Limitations of MSSQL on Linux
Microsoft very often communicates the status of the current limitations of MSSQL on Linux and describes these limitations in detail in the release notes.
For example, the following are not yet available:
- Full text search
- Replication
- Active Directory authentication
For more detailed information, refer to the official release notes.
Next steps
We are planning a variety of other tests and, time permitting, we might cover one more topic in this blog. We’re particularly interested in clustering MSSQL under Linux, as well as the use of MSSQL in containers.
Conclusion
Kudos to Microsoft for this solution. What the engineers from Redmond did with the technical implementation left our techies in awe.
Plus, the approach they chose allows Microsoft to pave the road for porting other Windows services to Linux. Ultimately, SQL PAL could also inspire Exchange, AD, IIS, etc. In any case, we wouldn’t be surprised if Microsoft announced as early as 2017 that Exchange is now available for Linux.
Interested?
Interested in getting MSSQL running on Linux? We’d be happy to help you out with any migration projects and share the expertise we’ve accumulated. We’ll be glad to hear from you. You can reach us at info@adfinis.com.