Setting up a server for a research team. What should be in my checklist?
from bergetfew@sopuli.xyz to selfhosted@lemmy.world on 12 Aug 20:15
https://sopuli.xyz/post/31947673
I’ve been asked to set up a server for a research team at my university. I’ve already had practice setting up a server at home, so I have a rough idea of how things should be done. Still, I wish to follow best practices when setting up a server for this use case. Plus, I would prefer to avoid too much tinkering, since I’m planning to keep the installation as simple as possible.
Following are some rough constraints and considerations for the setup:
- Server computer is a Mac Mini (latest model I think?). I’ve been told they would replace macOS with Linux, still I believe I should be ready in case they don’t (I don’t have experience with macOS at all)
- Server will be situated in university and provided a static IP address
- Team needs remote access to the server, presumably comfortable with using CLI
- I am unlikely to be permitted access to server myself after setup, so it should be ready to be managed by the team
- Extra hardware and/or paid software could be arranged but to a limited extent and within reason
I don’t think they have really any requirement other than having remote access to the server. I think SSH should suffice, however I was wondering if I could also arrange for backups, GUI server panel etc.
#selfhosted
Step one is check with the university IT department. Don’t put random unmanageable shit on other people’s networks.
Why a Mac running Linux? I can’t think of a use case for that.
The server should be no problem to the university as long as it’s set up to do what I was told it would do.
Is it okay to use macOS too? I thought Linux was more prevalent among servers. Although if operating it isn’t significantly different from Linux, then I’ll just stick with macOS after all.
It shouldn’t be significantly different from Linux, except maybe in what would be required to truly lock it down for a server. Assuming you’ll be installing normal server side stuff, all that should be nearly identical, especially if you use Docker or other containers/vm/interpreter based things.
Though it’s been yeeaaaars since I’ve touched a mac, so someone far more experienced will have to chime in if I’m off base!
The biggest pain is that it’s Apple silicon, which limits what’s built for it.
Depending on what it’s doing, you may need to build dependencies from source. You also may have to use some Mac-specific tools to download pre-builts and to do builds.
In some cases, it may be better to run a VM with Linux and forge on ahead, but it’s really going to depend on what tasks the server is handling.
It depends on what you’re trying to do with it. Typically people only use Macs as servers when they’re doing development for Apple products.
Can you ask to use the university’s Identity Management system? That way only current students and faculty will have access.
I was told that some team members work at different universities, so we would need to accommodate them as well
PITA, but if this is a small number of universities, you could have a separate realm for each.
And why can't university IT set up the server? No offense but you're a nobody asking us, also nobodies, how to set up some sort of a funky server on the university network, meanwhile the university pays people to do this for a living.
Where will the server actually be? Will it be in a secure location where only authorized personnel can physically access the machine or will it be behind the trash can in the cafeteria where anyone can access it?
Since you will lose access to it once it's set up who will monitor the system? Who turns it on in case it somehow gets shut down? Who sets up backups and does rollbacks if something breaks?
What happens to the hardware when research project is over?
To me it all smells like something the IT department should set up. They already know the best practices. They also know whatever security guidelines they need to follow. They will have monitoring systems in place so they could admin the system instead of leaving it without an actual administrator. And they're probably the ones decommissioning the hardware when the research project is over.
My suggestion is to leave it to the people who are getting paid to do this. It's one thing to know how to set up a home server on your home network, it's a different thing to set up a server on an enterprise network.
Mostly it boils down to laziness. They for sure have the ability to set up the server themselves but they can’t be bothered to unless it’s for a larger number of machines. They have essentially given a thumbs up to proceed with the setup but haven’t offered assistance themselves. I think the team might already have reached out to them, but were let down which is why they tried to contact me.
The server will be stored in the personal office of one of the members of the team. It should be physically secure.
I don’t think I would completely lose access to it, rather it’s just that I won’t be allowed to personally SSH to the server with my own devices. I may still be able to connect to it through one of the members’ devices or onsite. The team member earlier mentioned will take care of the system after the setup.
I don’t know what’ll happen to the server after the project is over, nor am I in a position to assume something.
What you describe is a workstation under someone’s desk. Usually when you connect to your campus vpn you should be on the network and be able to reach most things.
The problem is who is gonna manage it when you are gone? I’ve had teams come and ask to get their trash-can Macs rescued because whoever managed them left the team. How are you going to do backups? Are you gonna put a NAS next to it? Or actually use the tape drives that ITS can provide?
Ask your local research computing group what this team needs to do to host or contribute an actual server. A Mac mini is a consumer product; research grants can include hardware, and your computing group has actual racks and people who know how to manage them.
Why do you want to help lazy people? You are most certainly going to regret getting yourself into this.
Have you personally asked the IT department about this? I would be concerned that they were told “no” by IT, so they are asking you to do it behind their back.
probably because they already own it.
Will they need to install new software after you set it up, or just have user storage and maybe do system updates?
What will they be doing with it?
Do you have a backup storage location available?
How many users?
What kind of permissions do they want various users to have?
How critical is the data that will be housed on the server?
Sorry if I’m unable to provide specific details for these queries. I don’t have answers to most of them myself, which is why I was hoping to learn what the safest bet to implement would be in these situations.
Highly likely they would be installing new software
I don’t know much about its use case, although it won’t be too intensive since they probably have a separate machine for heavier work.
Backup storage option wasn’t proposed at all. I’m thinking of proposing to implement one.
I expect between 10-20 users.
User permission requirements weren’t discussed either, although I wouldn’t expect there to be any need to grant everyone admin privileges
Don’t know about the criticality of data. I could only speculate to be considerable by default.
Backup is step one, or even step 0, of setting up a server. The amount of frustration and even job loss a backup can prevent is always worth the expense of time/money.
The backup can be just setup scripts/config files/automation if the data doesn’t matter, but you do need it. Also, even if they say the data doesn’t matter, the data almost always matters. It may not now, but it will in 3 years when people use the server for real work and everyone doesn’t even begin to think about a backup until the server fails one day and they lose years’ worth of their grant and thesis data.
Backups can be simple or complex. They can be free or paid, they can have a GUI or just be scripts. Settle on one that you can make work, and CHECK THEM OCCASIONALLY with test restores of at least a few files. If you don’t test and confirm a working backup, you have hope, not resilience.
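A test restore doesn’t have to be fancy; here’s a minimal sketch using plain tar, where every path is a made-up example:

```shell
#!/bin/sh
# Minimal test-restore check: back up a directory, restore it elsewhere,
# then diff the two trees. All paths are hypothetical examples.
set -eu

SRC=/tmp/demo-data
BACKUP=/tmp/demo-backup.tar.gz
RESTORE=/tmp/demo-restore

rm -rf "$SRC" "$RESTORE"
mkdir -p "$SRC"
echo "results v1" > "$SRC/results.txt"

# Take the backup.
tar -czf "$BACKUP" -C "$(dirname "$SRC")" "$(basename "$SRC")"

# Test restore into a separate directory.
mkdir -p "$RESTORE"
tar -xzf "$BACKUP" -C "$RESTORE"

# If the trees differ, diff exits non-zero and set -e aborts the script.
diff -r "$SRC" "$RESTORE/demo-data" && echo "restore OK"
```

The same shape works no matter what tool takes the backup: restore a few files somewhere else, compare them to the originals, and only then trust the backup.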
Could you suggest what would be the most appropriate backup solution in this case? I could also ask them to arrange a backup drive or a cloud provider if needed.
Depends a bit on what you’re doing. Databases? Hypervisors? Just files? If all of the above, it’s best to use an actual product for this: either FOSS like BorgBackup or UrBackup, or something like Veeam, which is a popular paid option.
If it’s a Proxmox hypervisor, they have their own free backup appliance, but you need a second physical server to run it on.
If it's just databases, most have a built in way to take a backup. Just google the name and backup. Make sure it's running automatically and is moved to a separate server on each run.
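As a concrete illustration of “dump and move it off-box on each run”, here’s what that could look like for PostgreSQL; the database name, user, and host are all assumptions, not anything from the thread:

```shell
# Dump, compress, and push to a separate machine in one pipeline.
# "researchdb" and "backuphost" are placeholder names.
pg_dump researchdb | gzip | \
  ssh backup@backuphost "cat > /backups/researchdb-$(date +%F).sql.gz"

# Hypothetical crontab entry to run it nightly at 01:30:
# 30 1 * * * /usr/local/bin/db-backup.sh
```

Other databases have their own equivalents (mysqldump, sqlite3 .backup, etc.); the important part is that the result lands on a different machine automatically.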
For files, rsync is a great option.
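For the rsync route, a sketch might look like this; host, user, and paths are placeholders:

```shell
# Mirror a data directory to a second machine over SSH.
# -a preserves permissions/times, -z compresses, --delete mirrors deletions
# (so a mistaken local delete also propagates -- pair it with snapshots).
rsync -az --delete -e ssh /srv/research/data/ backup@backuphost:/backups/research/

# Hypothetical crontab entry to run it every night at 02:00:
# 0 2 * * * rsync -az --delete -e ssh /srv/research/data/ backup@backuphost:/backups/research/
```

Note the trailing slash on the source: it copies the directory’s contents rather than the directory itself.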
This basically means that the system will rot over time and will need someone who knows what they’re doing to maintain it. If they don’t know enough to do the initial setup, then I would worry about how quickly it would go awry after you no longer have access. Given the number of users and the assumed criticality of the data, I would have a long conversation about what can happen and what their plans are.
I personally would not do this. There are so many red flags.
I know this isn’t exactly what you’re asking for, but I’d recommend also looking into a VM OS such as proxmox or unraid (I’m running unraid)
They’ll let you create/destroy VM instances you can access remotely. So in theory, you can give everyone their own VM to use and access the files on the server.
However, unraid / proxmox may have performance issues running in a VM on a Mac mini…
What do you need (temporary) VMs for?
I was thinking OP could give everyone their own VM to use as a workstation so they could access the files on the server easily, and/or run programs based on their work. When their coworkers leave, OP can easily destroy the VM and the resources would be automatically reallocated (depending on the server’s configuration). With a physical device, the storage on that device is only allocated to that device and can’t be shared when it’s not in use
Me, personally? I have multiple VMs for different contexts: my teaching job (super clean, video sharing tools, presentation tools), gaming, media server (has scripts to download stuff off of YouTube), server management (just a regular Debian install), and a fuck around box (I just use it to try new OSs like Fedora, or try breaking OSs like deleting the system32 folder on windows)
Ok, good reasons. I would’ve thought about vscode, RStudio Server, et al., so that you really only have a server. I hate not having a sound card on a remote Windows server
Unreal tournament
You could tell him he needs RAM and a CPU too, but they sound like they are at a skill level where they have a grasp on first principles and essential basics.
I think you really need to talk to the research team to find out what they want to use the server for, and how they want to collaborate. That will inform everything else.
Part of the reason they left so many details vague was to give me some freedom on what to set up on the server based on what I think is right, although I do agree there needs to be clarification on some points.
Could you give me a hint on what I should additionally ask regarding their server needs?
I guess as a starting point most of us in this thread don’t really know what university research teams do.
If they had a laptop or phone, what kinds of things would they want to do that requires a server? Will they need email? Instant messaging? File sharing? Document collaboration? Will there be sensitive information? Do they need specific software? Or put another way, without this server, what can’t they do?
If you can give some hints on that kind of stuff, I’m sure people in this thread can help out more with specifics on software/tech recommendations.
Edit: obviously Unreal Tournament is non-negotiable.
I wasn’t able to get a clear response, but I can say that they are primarily going to use it for writing and storing code like a GitHub repo, plus installation of 2-3 programs whose names I couldn’t recognise.
They could use GitHub itself, but I know they know this too and deliberately chose to work this way. I could probably suggest software like Gitea or Forgejo for this purpose, but I suppose they aren’t in need of that.
GitHub doesn’t need a server. It is a cloud service.
Why can’t they run the programs locally?
This makes sense. Sometimes it’s better to run ‘helper’ programs in a remote container so configs and such are synced.
If they are trying to setup an inference server for cursor or something, though, you will need to run OSX. Linux does not support Metal acceleration (last I checked).
I think step 1 would be to see if they need a server.
If they really didn’t provide you any more information than what you mentioned in the post and comments, and you won’t even be permitted access to maintain the server, I wouldn’t complicate it too much. Even if you could do more, you’d be guessing, and probably make life harder for the researchers, who might not have the expertise to actually maintain something too complex.
Do the bare minimum to make it functional and overall secure, make sure the operating system works, get SSH access configured for as few people as you can get away with, and make sure updates are installed automatically. They should be responsible for everything else and you should make that clear to them (backups, software, etc)
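The “SSH for as few people as possible, updates automatic” baseline can be captured in a couple of config fragments. This is a sketch assuming a Debian/Ubuntu-style install; the usernames are placeholders:

```shell
# /etc/ssh/sshd_config -- minimal hardening sketch:
#   PasswordAuthentication no     # keys only
#   PermitRootLogin no
#   AllowUsers alice bob          # explicit allow-list of accounts

# Automatic security updates on Debian/Ubuntu:
sudo apt install unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades
```

Whatever the exact knobs, write down what you changed so the hand-off notes match the machine.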
Provide notes on what you did to the future owners of the server and maintenance instructions as well.
If you are part of an IT team in the university, and if you have some leverage on it, make sure you have the authority to handle things in an emergency (like having the right to pull the plug if the server goes rogue or misbehaves somehow). Also look to see if you can push them to a more standardized alternative: if your IT team provides standard services, see if their use case can be fulfilled by them, even partially. I know a lot of universities provide code forges and job-submission clusters students and teachers can use; maybe their use case fits these.
Use a file system with snapshots and differential backups, like ZFS, and snapshot it daily. Stream the diffs somewhere they can’t log in to and which doesn’t mount the FS.
This will invariably save their bacon at some point.
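The snapshot-and-stream flow described above might look like this on ZFS; the pool, dataset, and host names are all made up for illustration:

```shell
# Take a named daily snapshot.
zfs snapshot tank/research@2025-08-12

# Stream only the incremental diff since yesterday's snapshot to another
# box, received unmounted (-u) so nobody on that box can touch the data.
zfs send -i tank/research@2025-08-11 tank/research@2025-08-12 | \
  ssh backup@backuphost zfs receive -u backup/research
```

In practice you’d wrap this in a dated script run from cron, and prune old snapshots on a schedule.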
Although I suppose this could be done in a VM, it’s an otherwise unlikely combo, since this will be an M-series Mac and Proxmox has no compatible native arm64 version.
Oof. Do you have any experience doing stuff like this? If you don’t, I don’t think you should take this job. If you know what you’re doing, however, then I don’t understand many of your questions. What is the DHCP and firewall situation at the uni? What is your backup solution? Why will you not have an access key after setup? Is there another team also managing it? What do they think?
In any case, I would NOT use a Mac as a server. You can run Asahi or such on it, but many of our idioms just don’t work on Mac.
If it’s from a uni and power consumption and noise don’t matter, I would buy (consciously) three used 1U servers and cluster two Proxmox nodes. On the third I would run a Proxmox Backup Server. If money also doesn’t matter, I would do the same, but buy new.
However, you may not even need proxmox, but the issue is that you don’t even know what they are going to use the server for. This makes it impossible for us to give you good suggestions.
I was asked to set up the server knowing that I have limited knowledge of managing stuff like this. They already have a sysadmin on campus, but I think their setup is simple enough that they were willing to approach me. Besides, I do consider myself experienced enough to work my way around a CLI and troubleshoot issues, even if I haven’t had experience with hardware like this.
I don’t think negotiating for a different computer would be possible. The main challenge would be to make best of the hardware I’m provided, with additional peripherals if needed.
I do plan on asking them the nature of the work to be done on the server, but I wouldn’t expect it to be too niche or computationally intensive since they have separate computers for that. In any case, I will relay the points highlighted in this thread to them and get a clear idea of what is needed to be arranged.
How big is the university?
I’m confused; we had servers at our school. It can be whatever that’s accessible over SSH, but it should also only be accessible through the school VPN/network; I guess that’d be the default if it’s stored at the school. Maybe make sure it has wake-on-LAN for convenience. Btrfs snapshots for easy reversals.
Usually you ask IT department to spin up a vm for you. They will take care of security and backups.
I guarantee that if your university IT department knows nothing of this they will not appreciate a shadow IT device set up on their network when they find it. And they WILL find it. Reach out to your school’s IT team and make sure your research team has approval to do this before proceeding
Source: worked in Higher Ed IT
They already have gotten the permission for this.
However the IT department wouldn’t be helping with enabling public access to the computer via the university’s intranet. So it is up to me to figure out an alternative connection strategy like tunneling or VPN.
Perhaps consider a SOCKS5 proxy. If done over SSH, the client system’s networking would act as though it were on the server itself, traffic would be secure, and it would walk around most firewall rules that probably exist.
Using key based authentication would also make it such that it is more secure and easier for the researchers to log in - they wouldn’t even have to remember a password, they would just need some SSH client/configs.
Specifically, read up on “bastion hosts”.
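Combining key-based auth with a SOCKS proxy over SSH is mostly client-side setup. A sketch, where the hostname, username, and port are all placeholders:

```shell
# One-time key setup on each researcher's machine:
ssh-keygen -t ed25519
ssh-copy-id researcher@research-server.example.edu

# ~/.ssh/config entry that opens a local SOCKS5 proxy on port 1080
# whenever the connection is up:
#   Host research
#       HostName research-server.example.edu
#       User researcher
#       IdentityFile ~/.ssh/id_ed25519
#       DynamicForward 1080
```

After that, `ssh research` logs in without a password, and applications pointed at `localhost:1080` as a SOCKS5 proxy route their traffic through the server.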
I might double check if you’re taking the research team’s word for it. It’s very strange to me that IT won’t help with setting up remote access, especially if you’re doing something like setting up a tunnel that would likely bypass their firewall rules and network monitoring. If anyone compromises your tunnel and is able to access education, financial, or health records with it, your IT team and you will be in very hot water.
I’ve had to set up remote access for research projects before, and that involved site-to-site tunnels between participating campuses. These always run on the university’s VPN infrastructure; you’re not going to be rolling your own Tailscale or WireGuard to do this, because you want to be able to inspect the traffic flowing over it. You’ll also need to tie into the university’s identity management platform to make sure that only authorized users have access to the server.
Whatever you decide to do, at least deliver a copy of your documentation, because once you’re out of the picture, it’s going to be up to campus IT to maintain it.
Honestly, after considering the security implications of enabling access to the university’s network, I think I would first warn the team about this before setting up anything and let them decide how to proceed afterwards. I’ll also inform them to ask the IT department for the in-house VPN solution and identity management.
I don’t believe there would be need for the team to access anything in the network apart from the computer itself. Is it possible to arrange a solution that disables connections to intranet devices through the server by default just to be safe?
Sounds good! It’s always a good idea to make sure everyone is on the same page about the risks involved in a project. I must also stress that you reach out to IT personally and make sure they know what’s going on. We’ve definitely had faculty go behind our back before, I’m sure that happens at other schools too, lol. I do have some final thoughts based on your bullet points to consider:
1) Server computer is a Mac Mini (latest model I think?). I’ve been told they would replace macOS with Linux, still I believe I should be ready in case they don’t (I don’t have experience with macOS at all)
I would personally avoid this, just due to the fact that the hardware might be in an insecure area and easily forgotten. Once the team is done with it, they might just leave it in place to idle forever. Or, someone might find it, not know what it is, and remove it. A VM running in the campus datacenter is a lot safer than a computer under a lab counter. This also avoids future hiccups with an exotic OS/hardware combination that you might not be thinking about right now.
2) Server will be situated in university and provided a static IP address
You’re going to need IT’s involvement on this, full stop. Just because an address isn’t in use right now on the campus network, doesn’t mean that it’s free for you to use. As far as public facing IPs go, those are totally managed by campus IT. You will need their permission to host services on them.
3) Team needs remote access to the server, presumably comfortable with using CLI
See number 2. You need IT’s permission to host services on the network.
4) I am unlikely to be permitted access to server myself after setup, so it should be ready to be managed by the team
You need to document everything you set up, ready to hand off to whoever will be managing this server. This is unlikely to be someone on the research team; if they can’t set it up themselves, I doubt they’ll be able to maintain it.
5) Extra hardware and/or paid software could be arranged but to a limited extent and within reason
There’s generally going to be a budget and/or grants available depending on the scope of the work the research team is doing. Going through the proper channels (IT Dept and Grant Officer) will give you access to these resources. As an example, a center involved with our school was doing watershed research and we were able to secure them a grant for tablets they could use for fieldwork.
One final, important thing to consider is what data this research team is collecting, what they’re using it for, and what’s going to happen to it once they’re done with it. If there is any PII (Personally Identifiable Information) in this dataset, there might be laws (e.g. GDPR) that you might need to comply with.
I hope that you don’t think that I’m harping on you too bad, it’s just that there’s lots to consider outside of the raw technical side of things that many homelabbers don’t really have to think about. I do admire your willingness to take on a project like this! Hopefully your IT department aren’t sticks in the mud and would be willing to help out on this. If they had any sense they would as it prevents users trying to DIY things in the first place.
Huh.
There’s a time and place for a DIY solution and academia can well be like that sometimes.
The latest Mac Mini can’t run Linux though. It’s M4, and Asahi doesn’t even support M3 chips yet. But if you actually got the previous model with M1/M2, you can do Linux if desired. I might not attempt it, and just use the Mac as a server as-is. It’s not too different from Linux. Asking the duck for “how to xx on Mac” when you already know the Linux equivalents should make your life tolerable.