Service Template

Availability

  • Who is allowed to use this service (Internal to the group, Everyone in the college, Multiple colleges, Entire University, Global)?

Life cycle

  • Living will - What happens when funding or support is removed for this service? Ideas: Data will be safely exported, Data will be archived, VM will be archived. What will the recovery cost be for extracting data? Will we allow users to export their own data for a period of time?
  • OS patching frequency - How often will we update the underlying OS/OVA/Appliance/Firmware? Ideas: We will run on a 10-day lag, As soon as available, Next possible maintenance window. Who will do this?
  • Application updates - How often will we update the application providing the service? Ideas: We will run on a 10-day lag, As soon as available, Next possible maintenance window. Who will do this?
  • Sunset plan - How and when will we re-evaluate the service?
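
If the "10-day lag" idea is chosen for OS or application updates, the rule is simple enough to automate. The following is a minimal sketch, assuming the lag is counted in calendar days from the vendor release date; the function names and the default of 10 days are illustrative, not an existing CIT tool.

```python
from datetime import date, timedelta

# Assumption: the lag is measured in calendar days from the release date.
LAG_DAYS = 10


def earliest_install_date(released: date, lag_days: int = LAG_DAYS) -> date:
    """Return the first date an update may be installed under a fixed lag."""
    return released + timedelta(days=lag_days)


def is_due(released: date, today: date, lag_days: int = LAG_DAYS) -> bool:
    """True once the lag period for an update has fully elapsed."""
    return today >= earliest_install_date(released, lag_days)
```

For example, an update released on 1 January would become eligible on 11 January and could then be scheduled for the next possible maintenance window.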

Dependencies

  • What equipment/services are required for this service to function?
  • Are there significant services that are dependent on this service?

Workflows

Service requests

  • Operational requests that should be able to be completed by anyone with access rights
  • Easy and quick
  • Completed in under an hour
  • Should occur relatively often
  • How should someone enter a service request?

Change requests

  • Significant changes to the service
  • May require SME level knowledge
  • Long time to completion
  • May require restricted privileges

Uptime

  • What hours will the service be available? (e.g. business hours, 4am to midnight)
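
Once the availability hours are written down, monitoring and alerting can use them to decide whether an outage counts against the service. The following is a minimal sketch, assuming a single daily window in local time; the function name and the default window of 4am to midnight (taken from the example above) are illustrative.

```python
from datetime import time


def in_service_window(t: time,
                      start: time = time(4, 0),
                      end: time = time(0, 0)) -> bool:
    """True if t falls inside the half-open window [start, end).

    An end at or before the start means the window runs past midnight,
    so 04:00 to 00:00 covers 04:00 through 23:59. Equal start and end
    is treated as a 24-hour service.
    """
    if start < end:
        return start <= t < end
    return t >= start or t < end
```

With the defaults, 03:30 is outside the window and 23:30 is inside it. A window such as 22:00 to 06:00 can be expressed by passing those two times.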

Maintenance

Announcements

  • How will we notify users of the service about routine maintenance? (e.g. we will notify users of any outage 48 hours in advance)
  • Outage communications (e.g. we will send out an outage notification within 1 hour of the start of an outage. Additionally, our monitoring will automatically update our status page.)

Update policy

  • Not sure what I meant by this as it was covered in the life cycle section.

SLA

  • Time to resolution on service requests and change requests (e.g. We will triage all requests within 1 hour. All standard service requests will be processed in under 8 business hours.)
  • Service survivability and recovery
    • Service survivability
      • How precarious is this service?
      • e.g. The service is based out of the Pullman Campus in the Primary data center facility. If there is any disruption to this facility then the service will be unavailable.
      • e.g. The service is based on a distributed architecture with intelligent load balancing technology to allow for significant outages without causing an outage.
    • Service Recovery
      • If the underlying services cannot be repaired, what is the recovery plan?
      • e.g. We will reimplement the service using off-premise equipment and retrieve data from our backup solution.
  • How will this service be monitored?
    • What steps will be taken to ensure that the SLA and the other commitments in this documentation are met?
    • e.g. We will use UptimeRobot and Boson to monitor health metrics originating from the underlying equipment to provide real-time information to our alerting systems.
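
An SLA target such as "triage all requests within 1 hour" is only useful if it is measured. The following is a minimal sketch of how compliance could be computed from ticket timestamps, assuming each request is available as an (opened, triaged) pair of datetimes; the names are illustrative and do not refer to any particular ticketing system.

```python
from datetime import datetime, timedelta

# Assumption: the triage target from the example above, in wall-clock time.
TRIAGE_TARGET = timedelta(hours=1)


def triage_met(opened: datetime, triaged: datetime,
               target: timedelta = TRIAGE_TARGET) -> bool:
    """True if the request was triaged within the target interval."""
    return triaged - opened <= target


def triage_compliance(requests: list[tuple[datetime, datetime]]) -> float:
    """Fraction of (opened, triaged) pairs that met the triage target."""
    if not requests:
        return 1.0  # nothing to triage, so nothing was missed
    met = sum(triage_met(opened, triaged) for opened, triaged in requests)
    return met / len(requests)
```

The 8 business hour target for standard service requests would need a calendar of working hours and holidays, which this sketch deliberately leaves out.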

Data access policy

  • Under what circumstances can CIT look at this data?
  • e.g. CIT will not access any data belonging to customers without explicit permission from the data/service owner. This may delay troubleshooting activities while permission is sought. CIT is allowed to access and review metadata about the data/service to aid in rapid recovery. If there is a case not covered here, the decision will fall in favour of the data owner.

Data stewardship

  • Backup
    • What will we do for backups?
    • e.g. We will use our standard backup system to take system level backups of the underlying OS.
    • e.g. We will back up the code repositories and database independently on a daily basis. This will allow us to reconstitute the service on another platform within 4 hours.
  • Retention
    • How long will we keep the data?
    • e.g. We will utilize our standard backup schedule. You can refer to [insert link here] to find out our backup standards.
  • Survivability
    • What is the survivability of our data backup?
    • e.g. Please refer to [insert link here] for our data survivability standards.
    • e.g. We will utilize our off-campus backup facility for this service. This will allow for data survivability in the case of the loss of the Pullman campus.

Appropriate use policy

  • What is appropriate use of this service? What is not?
  • e.g. This service is meant to be used according to standard practices and should not be viewed as a space to do a horrible thing that no one in their right mind would ever want to do.
  • e.g. Don’t put 10 TB of incompressible bin files in GitLab.

Budget

  • TCA
    • Try to understand what the cost of this service may be. An Excel spreadsheet is an appropriate tool for this.
    • You should take into account the whole lifecycle of the service from standup to complete removal and archival of the data. You can say that this will be reimplemented using current best practices.
    • Do try to account for the following phases of the life cycle: implementation, running cost, de-provisioning.
  • Recovery
    • Will the cost of this service be able to be recovered? This should be reconsidered if there is ever a change in scope of the service offering.
    • e.g. We will recover the cost of service via a per repository charge of $x. This will provide for ongoing maintenance and expansion of required equipment.
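
The budget questions above reduce to a small amount of arithmetic, which can be kept next to the spreadsheet as a sanity check. The following is a minimal sketch, assuming the three life cycle phases listed under TCA and a flat yearly charge per repository; every figure in the usage example is hypothetical and only illustrates the calculation.

```python
def total_cost(implementation: float, annual_running: float,
               years: int, deprovisioning: float) -> float:
    """Whole-life cost: implementation + running cost over the service life
    + de-provisioning (removal and archival of the data)."""
    return implementation + annual_running * years + deprovisioning


def per_repo_charge(total: float, repos: int, years: int) -> float:
    """Yearly charge per repository needed to recover `total` over `years`."""
    if repos <= 0 or years <= 0:
        raise ValueError("repos and years must be positive")
    return total / (repos * years)


# Hypothetical figures, for illustration only.
cost = total_cost(implementation=5000, annual_running=2000,
                  years=5, deprovisioning=1000)
charge = per_repo_charge(cost, repos=200, years=5)
```

With these made-up numbers the whole-life cost is 16000 and the charge works out to 16 per repository per year. If the scope of the service changes, rerunning the calculation with the new inputs shows whether the charge still covers the cost.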