We will have a scheduled maintenance window tonight from 7:00pm to 11:00pm PST. The following changes will take place:
- 7:00pm – Upgrade the server ‘mradm01’ from RHEL4 to RHEL5 (bug 542454). Actual downtime is expected to be 30 minutes or less, but it may take until the end of the window to get all services fully functional again. This server hosts ns1.mozilla.org (DNS server) and the nagios slave (facility-wide system monitoring) for the MPT colo facility. DNS services are redundant (ns2 and ns3 will pick up the slack). However, we will be without monitoring in the MPT colo facility while the server is being upgraded. This means that problems which are normally detected automatically and page the on-call sysadmin will not be detected or paged during the outage. If you observe any such problems during the window, please report them in the #bmo channel on irc.mozilla.org.
Please let me know if you have any reason why we should not proceed with this planned maintenance. As always, we aim to keep downtime to a minimum, but unexpected complications can cause longer downtime than expected. All systems should be operational by the end of the maintenance window.
Feel free to comment directly if you see issues past the planned downtime.
We will have a scheduled maintenance window tonight from 8:00pm to 11:00pm PST. The following changes will take place:
We will have a scheduled maintenance window tonight from 7:00pm to 11:00pm PST (0300 – 0700 UTC). The following changes will take place:
- 7:00pm – Breakpad maintenance – We will be upgrading crash-stats.mozilla.org and doing maintenance on the Breakpad database to improve future performance. crash-stats.mozilla.org will be down for around 4 hours. No crash reports will be processed during this window. We will still be accepting and storing incoming crashes. (bug 542390)
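The PST and UTC times in these notices differ by a fixed eight hours. As a rough sketch of that arithmetic (the calendar date below is a hypothetical placeholder, since the notice does not give one, and the fixed UTC-8 offset is an assumption that only holds while daylight saving time is not in effect):

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST is modeled as a fixed UTC-8 offset (no daylight saving).
PST = timezone(timedelta(hours=-8), "PST")


def pst_to_utc_hhmm(year, month, day, hour, minute=0):
    """Convert a PST wall-clock time to a four-digit UTC time string."""
    local = datetime(year, month, day, hour, minute, tzinfo=PST)
    return local.astimezone(timezone.utc).strftime("%H%M")


# Hypothetical date; only the time of day matters for this illustration.
window_start = pst_to_utc_hhmm(2010, 1, 14, 19)  # 7:00pm PST
window_end = pst_to_utc_hhmm(2010, 1, 14, 23)    # 11:00pm PST
print(window_start, window_end)  # 0300 0700
```

Note that both UTC times fall on the following calendar day.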
We will have a scheduled maintenance window tonight from 5:00pm to 11:00pm PST. The following changes will take place:
(Update)
We will have a scheduled maintenance window tonight from 9:00pm to 11:00pm PST. The following changes will take place:
- 7:00pm PST (0300 UTC) crash-stats.mozilla.com & crash-reports.mozilla.com content push. We’ll be picking up code updates (bug 538261). Duration one hour.
- 9:00pm PST (0500 UTC) getpersonas.com update. We’ll be updating getpersonas.com to pick up code updates (bug 537956). Duration 30 minutes.
We’ll be taking advantage of the end-of-holiday lull and post-Fennec 1.0rc1 / pre-Firefox 3.6 releases to perform maintenance on some Production & Release Engineering infrastructure.
During this window, all check-in trees will be closed.
We’ll be performing maintenance Sunday from 12:00pm to 6:00pm PST. The following changes will take place:
- 12:00pm PST (2000 GMT) Duration 2 hours: NetApp OS upgrade. We’ll be updating the two Network Appliance file servers that handle a number of Release Engineering VMs. See bug 531233 for more details.
- 12:00pm PST (2000 GMT) Duration 2 hours: b01 database cluster maintenance. We’ll be upgrading the firmware on the b01 database cluster (bug 535859) and performing maintenance to clear up disk space (bug 528573). This work will affect the following services:
- 1:00pm PST (2100 GMT) Duration 4 hours: ESX hardware upgrade. We’ll be upgrading the production ESX cluster from old dual-CPU HP DL385 servers to our standard 8-core/32GB RAM HP BL460 platform. Since this upgrade involves switching CPU platforms (AMD to Intel), each VM must be shut down, migrated, and powered up again. Each VM’s downtime will be comparable to a normal reboot. The following list includes notable VMs & services that will be briefly affected:
We will have a scheduled maintenance window tonight from 6:00pm to 12:00am PST. The following changes will take place:
- 7:00pm PST (0300 UTC) Geo-location support for bouncer/download.mozilla.org. We’ll be turning on geo-location support for download.mozilla.org tonight. This should improve the download experience by directing users to mirrors that are geographically close to their origin IP. See bug 459919 for more details. No downtime expected.
- 7:00pm PST (0300 UTC) PHP 5.2 upgrades. We’ll be rolling out PHP 5.2 upgrades on the production web servers (bug 533486). We don’t expect any user-facing downtime. Duration 3 hours.
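To illustrate the idea behind geo-location support (sending a user to a geographically close mirror), here is a minimal sketch. It is not bouncer’s actual implementation: the mirror names and coordinates are hypothetical, and the step that maps an origin IP to a latitude/longitude (a GeoIP lookup) is assumed to have happened already.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical mirrors with approximate (latitude, longitude) in degrees.
MIRRORS = {
    "us-west.example.org": (37.77, -122.42),
    "eu.example.org": (52.37, 4.90),
    "asia.example.org": (35.68, 139.69),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_mirror(client_location, mirrors=MIRRORS):
    """Return the name of the mirror closest to the client's location."""
    return min(mirrors, key=lambda name: haversine_km(client_location, mirrors[name]))


# A client whose IP was (hypothetically) located near Paris.
print(nearest_mirror((48.85, 2.35)))  # eu.example.org
```

A real deployment would likely also weigh mirror capacity and health, which this sketch ignores.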
We will have a scheduled maintenance window tonight from 5:00pm to 11:00pm PST. The following changes will take place:
- 6:00pm PST (0200 UTC) Kernel upgrades. We’ll be doing kernel security updates on various machines affecting the following services for approximately 10 minutes each:
- *ALL* sm-* (development) servers.
- All IT-administered build services, including build.mozilla.org, try server, cruncher, AUS staging, and the pageload/graph servers
- All VPNs and jumphosts in the offices and colos (except Beijing)
- All office and colo DNS and DHCP servers (except Beijing)
- All Web-dev and app staging servers
- support.mozilla.com (chat services and knowledge base)
- All version control services and related utility services (CVS, CVS-mirror, Hg, SVN, Bonsai, HgWeb, ViewVC, Tinderbox, MXR)
- ftp.mozilla.org
- stage.mozilla.org
- people.mozilla.com
- intranet.mozilla.org
- quality.mozilla.org
- litmus.mozilla.org
- irc.mozilla.org (multiple servers, will fail over via DNS to keep it up)
- addons.mozilla.org
- download.mozilla.org
- videos.mozilla.org
- nagios.mozilla.org
- bugzilla.mozilla.org
- developer.mozilla.org
- spreadfirefox.com
- Internet-facing mail relays, list servers, and outbound mail (mail will be queued)
- All public and private wikis and blogs administered by Mozilla
Some services may have more than one outage as various back-end components each get their upgrades (for example, a web application will go down once when the web servers are upgraded and again when its database server is upgraded).
- 7:00pm PST (0300 UTC) LDAP upgrade. We’ll be upgrading OpenLDAP tonight. Our LDAP infrastructure is redundant, so no noticeable downtime is expected. Duration 30 minutes.
- 9:00pm PST (0500 UTC) Zeus TM configuration changes. We’ll be changing some SSL cache settings on the production Zeus cluster (bug 533809). No downtime is expected.
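The note above about services seeing more than one outage during the kernel upgrades can be made concrete with a small sketch. The service names and the component map below are hypothetical and only illustrate the counting: a service goes down once for each of its back-end components that gets upgraded.

```python
# Hypothetical map of services to the back-end components they depend on.
SERVICE_COMPONENTS = {
    "webapp.example.org": ["web-servers", "database"],
    "static.example.org": ["web-servers"],
}


def outages_per_service(components_by_service, upgraded_components):
    """Count how many upgraded components each service depends on."""
    upgraded = set(upgraded_components)
    return {
        service: sum(1 for component in components if component in upgraded)
        for service, components in components_by_service.items()
    }


# Both the web servers and the database server are upgraded in this window.
print(outages_per_service(SERVICE_COMPONENTS, ["web-servers", "database"]))
# {'webapp.example.org': 2, 'static.example.org': 1}
```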
We will have a scheduled maintenance window tonight from 5:00pm to 11:00pm PST. The following changes will take place:
- 6:00pm PST (0200 UTC) Breakpad upgrade. We will be upgrading the Breakpad environment to the next release (bugs fixed). In addition, we will be changing the storage layout to help back-end storage scaling. The crash collector should not be impacted during the upgrade. The reporter, however, may be unavailable during parts of the upgrade. Please contact us in #breakpad if you notice issues past 9pm. Duration 3 hours.
- 6:00pm PST (0200 UTC) support.mozilla.com update. We’ll be updating support.mozilla.com to pick up code updates (bug 532232). Duration 2 hours.