Channel: Ubuntu Forums - Virtualisation

[ubuntu] Virtual webserver suddenly down and LVM LV gives me i/o errors

Hello,
I run an Ubuntu 14.04.3 LTS server as the host machine for a virtual web server.

Summary of my system setup

The physical machine has two 250 GB SSDs and two 3 TB Western Digital HDDs (Red NAS edition).
The two SSDs run in software RAID 1, and so do the two HDDs. LVM sits on top of both RAID arrays, giving two VGs in total: SYSTEM and DATA.
The host system (Ubuntu 14.04.3 LTS) is installed on the SSD RAID, while the HDD RAID is used for file storage and as the LVM volume group that holds my virtual machine images.
The machine has two network interfaces: one for the host and an isolated one for the clients.

Software in use

Host

The host (Ubuntu Server 14.04.3 LTS) runs Samba, Netatalk and UFW for file sharing on the LAN side. I also installed the libvirt package together with virt-manager to set up my virtual clients.

(Virtual) Client
It runs a minimal Ubuntu Server with the virtual headers (+ virtual extras) as its system. Installed are Apache2 with modules such as PHP and mod_security, plus a CIFS client package to mount the file storage folder directly from the host, and several extra packages our PHP applications need (ImageMagick, gdm, xml and so on). The Apache server in the virtual machine hosts only two PHP applications (web services): 1. OwnCloud and 2. ResourceSpace, both based on PHP, MySQL, HTML and CSS.

Additionally
For security reasons the virtual client has its own isolated network interface, connected directly to the DMZ interface of my firewall, and only ports 80 and 445 are open.
The second interface is for the host, which serves our internal network with Samba and Apple's Netatalk. The host also has a folder on the HDD where all company data and all data from OwnCloud and ResourceSpace are stored in one place (only the data our services collect from the users, not the PHP scripts themselves, which are stored inside the virtual machine). The reason is that OwnCloud shares externally (WAN) the same data that we already share in our internal network (LAN) with Samba. I therefore created a direct network link using KVM's virtual bridge in isolated mode, hardened by firewall rules in the host's UFW, so virtual clients can connect to the host over the Samba port only.

Problem
This setup ran very well for a few weeks. But today, for no apparent reason, the web server started to hang. Nothing seemed to help, so I rebooted and then noticed the main problem.

As explained, the web server is a virtual installation of Ubuntu Server inside KVM. It therefore needs a place to store its data (images). This can be a file such as .raw or .qcow2 or, as many tutorials recommend, an LVM volume group in which KVM automatically creates a logical volume; the virtual machine is then stored there in raw format. This should give me better performance, and snapshots can be used to back up a running machine.


That probably sounds complicated and like a lot of information, so let me get to the point. I have a logical volume /dev/DATA/WEBSERVER1 on my hard disks, and it is part of the volume group DATA. The logical volume worked very well for the last few weeks, but now when I run lvscan or a similar operation I can no longer access it:

Code:

:/dev/DATA# lvscan
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 52428734464: Eingabe-/Ausgabefehler
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 52428791808: Eingabe-/Ausgabefehler
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 0: Eingabe-/Ausgabefehler
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 4096: Eingabe-/Ausgabefehler
  ACTIVE            '/dev/DATA/storage1' [2,44 TiB] inherit
  inactive Snapshot '/dev/DATA/WEBSERVER1' [19,53 GiB] inherit
  ACTIVE            '/dev/DATA/WEBSERVER1-bak_final_cloud_resspace' [48,83 GiB] inherit
  ACTIVE            '/dev/UBUNTU/system' [102,45 GiB] inherit
  ACTIVE            '/dev/UBUNTU/swap' [8,80 GiB] inherit

"Eingabe-/Ausgabefehler" is German for "Input/output error".
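For anyone who wants to reduce the output above to the essentials, here is a minimal Python sketch that groups the "read failed" lines by device and lists each volume's state. The sample text is copied from the lvscan output in this post; the function name `summarize` and the two regular expressions are my own assumptions about the line format, not part of LVM.

```python
import re

# Text copied from the lvscan output quoted in this post (German locale).
LVSCAN_OUTPUT = """\
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 52428734464: Eingabe-/Ausgabefehler
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 52428791808: Eingabe-/Ausgabefehler
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 0: Eingabe-/Ausgabefehler
  /dev/DATA/WEBSERVER1: read failed after 0 of 4096 at 4096: Eingabe-/Ausgabefehler
  ACTIVE            '/dev/DATA/storage1' [2,44 TiB] inherit
  inactive Snapshot '/dev/DATA/WEBSERVER1' [19,53 GiB] inherit
  ACTIVE            '/dev/DATA/WEBSERVER1-bak_final_cloud_resspace' [48,83 GiB] inherit
  ACTIVE            '/dev/UBUNTU/system' [102,45 GiB] inherit
  ACTIVE            '/dev/UBUNTU/swap' [8,80 GiB] inherit
"""

# Assumed line formats, derived only from the output above.
READ_FAIL = re.compile(
    r"^\s*(?P<dev>/dev/\S+): read failed after \d+ of \d+ at (?P<offset>\d+):"
)
LV_LINE = re.compile(
    r"^\s*(?P<state>ACTIVE|inactive)\s+(?P<kind>Snapshot|Original)?\s*"
    r"'(?P<dev>[^']+)'\s+\[(?P<size>[^\]]+)\]"
)


def summarize(text):
    """Return (failures, volumes) parsed from lvscan-style text.

    failures maps a device path to the byte offsets of its failed reads;
    volumes is a list of (device, state, kind, size) tuples.
    """
    failures = {}
    volumes = []
    for line in text.splitlines():
        m = READ_FAIL.match(line)
        if m:
            failures.setdefault(m["dev"], []).append(int(m["offset"]))
            continue
        m = LV_LINE.match(line)
        if m:
            volumes.append((m["dev"], m["state"], m["kind"] or "", m["size"]))
    return failures, volumes


failures, volumes = summarize(LVSCAN_OUTPUT)
for dev, offsets in failures.items():
    print(f"{dev}: {len(offsets)} failed reads at offsets {offsets}")
for dev, state, kind, size in volumes:
    print(f"{state:8} {kind:8} {dev} [{size}]")
```

On a live system the same function could be fed the captured text of `lvscan` instead of the pasted sample.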

As you can see, I can no longer access the logical volume WEBSERVER1, while all other logical volumes in the same volume group are still healthy and run without problems.
I also checked the RAID with mdadm -> everything is fine.
Then I checked the SMART status of all disks -> no errors, everything is fine.
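For reference, the RAID check can also be scripted. This is only a rough Python sketch: it scans text in the format of /proc/mdstat and reports arrays whose member status contains an underscore (a missing disk). The sample text and its device names and block counts are hypothetical, not taken from my machine, and the function name `degraded_arrays` is my own.

```python
import re

# HYPOTHETICAL sample in /proc/mdstat format; devices and sizes are made up.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 sdd1[1] sdc1[0]
      2930134016 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      243949376 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status field (e.g. [U_]) shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status field is the last bracketed group on the "blocks" line.
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On the real host the input would be the contents of /proc/mdstat; an empty list would match the "everything is fine" result above.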


All I know is that a friend was using the OwnCloud web service for some uploads yesterday (just as on the days before). All uploads are stored in the storage directory (/dev/DATA/storage1) outside the virtual machine. Suddenly, he said, the server stopped loading our OwnCloud website, and that was the minute my LVM logical volume became inaccessible. By the way, he has no system user privileges or even a login, so nobody can have caused this directly.

Before I restore a two-week-old backup, I would like to find out what happened to my volume, because I want to avoid the same situation in a few weeks.

I have no clue what happened or what caused this. I have just one suspicion: KVM was set up to reserve up to 50 GiB for webserver1 as a logical volume in the volume group, but it should only grow the storage once the data exceeds 20 GiB, because I allocated a fixed 20 GiB to start with, which should normally be enough. Besides, this is a normal function of KVM and other virtualisation systems too.
Might this be the reason my volume broke? I also do not understand why the volume group's config file contains snapshot entries relating to webserver1 that nobody created and that do not exist on the disk.
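As a cross-check of the sizes mentioned here against the lvscan figures, the following Python sketch does the unit arithmetic. It assumes (my guess, not something the output states) that the "50 GiB" and "20 GiB" were actually entered as 50000 MiB and 20000 MiB; under that assumption they come out as exactly the 48,83 GiB and 19,53 GiB that lvscan prints, and the failed reads sit just below the 50000 MiB mark.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def to_gib(n_bytes):
    """Convert a byte count to GiB as a float."""
    return n_bytes / GIB


# Assumed sizes: 50000 MiB and 20000 MiB (hypothesis, see text above).
reserved = 50000 * MIB
allocated = 20000 * MIB

# Offsets of the failed reads, copied from the lvscan output in this post.
failed_offsets = [52428734464, 52428791808, 0, 4096]

print(round(to_gib(reserved), 2))   # 48.83, matches the 48,83 GiB in lvscan
print(round(to_gib(allocated), 2))  # 19.53, matches the 19,53 GiB in lvscan

# The two high offsets lie 64 KiB and 8 KiB before the 50000 MiB mark,
# which is well beyond 20000 MiB.
for offset in failed_offsets[:2]:
    print(offset, reserved - offset, offset > allocated)
```

If the assumption holds, the device being read when the errors occur is about 48.83 GiB in size, while the 19,53 GiB shown next to the inactive snapshot is a separate, smaller figure. What that means for the cause I cannot say.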

Any ideas how I could fix this issue, or how I could avoid such a failure in the future? Is it safe to use LVM for virtualisation, or should I switch back to regular files like qcow2?

Thank you very much
