Wednesday, December 11, 2019

WHAT IS DEVOPS?

DevOps is a collaboration between the development and operations team members.
The core of DevOps is the CAMS (Culture, Automation, Measurement, Sharing) approach.

DevOps principles

Systems Thinking - you need to understand the whole system to optimize it well.

Feedback Loops - creating, shortening, and amplifying feedback loops between the parts of the organization that are in the flow of the value chain. A feedback loop is simply a process that takes its own output into consideration when deciding what to do next.

Experimentation and Learning - master the skills and tools you already know, then find new ones and try them.


DEVOPS METHODOLOGIES

1) People over Process over Tools.
 It recommends first identifying who is responsible for a job function, then defining the process that needs to happen around them, and only then selecting and implementing the tool to perform that process.


2) Continuous Delivery.
It's the practice of coding, testing, and releasing software frequently, in really small batches so that you can improve the overall quality and velocity.

3) Lean Management.
It consists of using small batches of work, work-in-progress limits, feedback loops, and visualization. Studies showed that lean management practices led both to better organizational outputs, including system throughput and stability, and to less burnout and greater employee satisfaction at the personal level.

4) Change control.
Visible Ops - describes a light and practical approach to change control. It emphasizes eliminating fragile artifacts, creating a repeatable build process, managing dependencies, and creating an environment of continual improvement.

5) Infrastructure as code.
One of the major realizations of modern operations is that systems can and should be treated like code. System specifications should be checked into source control and go through code review, a build, and automated tests. Then we can automatically create real systems from the spec and manage them programmatically. With this kind of programmatic system, we can compile, run, kill, and run systems again, instead of creating hand-crafted permanent fixtures that we maintain manually over time.


DEVOPS PRACTICES

10) Incident Command system
9) Developers on call
8) Status Pages - create status pages and communicate with users when an issue arises, so that users understand what is going on.
7) Blameless Postmortems - a meeting held after an incident. Do it within 48 hours of the incident, if possible. Have a third party run it. Set all timelines to UTC. It should cover:
  • A description of the incident
  • A description of the root cause
  • How the incident was stabilized or fixed
  • A timeline of events, including actions taken (in UTC)
  • How the incident affected customers
  • Remediation and corrective actions

6) Embedded Teams
 How do you get this if the team is too big or distributed?
  • Chat rooms
  • Wiki pages
  • Source code (read)
  • Infrastructure
  • Monitoring tools
  • Ticket tracker

5) The cloud -  cloud solutions can
4) Andon Cords - if something is not right, anyone can stop the release before it is deployed to production
3) Dependency Injection -
2) Blue/Green Deployment -
1) Chaos Monkey - will trash or kill your servers in real time, forcing your engineers to find ways to make the service robust and tolerant of “instance failures”.

Kaizen -
Gemba

1) Plan (Problem finding) -> Do (Display) -> Check (Clear) -> Act (Acknowledge)
2) 5 Whys

Provisioning is the process of making a server ready for operation, including hardware, OS, system services, and network connectivity.
Deployment is the process of automatically deploying and upgrading applications on a server.
Orchestration is the act of performing coordinated operations across multiple systems.
Configuration management is an overarching term dealing with change control of system configuration after initial provisioning. It is also often applied to maintaining and upgrading applications and application dependencies.

How tools approach configuration management


  • Imperative, also known as procedural. The commands needed to produce a desired state are defined and then executed.
  • Declarative, also known as functional. You define the desired state, and the tool converges the existing system on the model.
  • Idempotent. The ability to execute the CM procedure repeatedly and end up in the same state each time.
  • Self-service. The ability for an end user to kick off one of these processes without having to go through other people.
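
The difference between the imperative and declarative approaches, and what idempotent means in practice, can be sketched in a few lines of Python. This is only an illustration: Host, imperative_install and converge are hypothetical names made up for this example, and the "server" is just an in-memory object, not the API of any real CM tool.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Host:
    """Hypothetical in-memory stand-in for a server (not a real CM API)."""
    packages: Set[str] = field(default_factory=set)
    commands_run: List[str] = field(default_factory=list)


def imperative_install(host: Host, package: str) -> None:
    # Imperative: the command is executed every time, whatever the current state.
    host.commands_run.append(f"install {package}")
    host.packages.add(package)


def converge(host: Host, desired: Set[str]) -> List[str]:
    # Declarative: compare the desired state with the actual state
    # and run only the commands needed to close the gap.
    actions = [f"install {p}" for p in sorted(desired - host.packages)]
    actions += [f"remove {p}" for p in sorted(host.packages - desired)]
    host.packages = set(desired)
    host.commands_run.extend(actions)
    return actions


scripted = Host()
imperative_install(scripted, "nginx")
imperative_install(scripted, "nginx")  # runs the same command a second time

managed = Host(packages={"telnet"})
first_run = converge(managed, {"nginx", "openssl"})
second_run = converge(managed, {"nginx", "openssl"})  # idempotent: nothing left to do

print(len(scripted.commands_run))
print(first_run)
print(second_run)
```

The second converge call returns an empty list of actions, which is the property that makes it safe to run a CM procedure on a schedule.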

DEPLOYMENTS

Canary deployment is where you upgrade one server in a fleet and see how it works before upgrading the rest.
Blue/green deployment is where you have two identical environments, one of which is production and one of which is staging. New code is deployed to the staging environment, and then the two environments are swapped. There are variants on this practice, such as cluster immune system deployment.
Immutable deployment is where you never upgrade software in production at all. You discard old virtual systems and put new ones in place.
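
A canary deployment can be sketched roughly as follows. This is a minimal sketch, assuming the caller supplies two hypothetical callbacks, deploy and healthy. The server names and the dictionary of versions are invented for the example, and a real rollout would also roll the canary back on failure.

```python
from typing import Callable, List


def canary_rollout(
    servers: List[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade one server first and continue only if it stays healthy.

    Returns the list of servers that were upgraded.
    """
    if not servers:
        return []
    canary, rest = servers[0], servers[1:]
    deploy(canary)
    if not healthy(canary):
        # Stop here: the rest of the fleet keeps running the old version.
        return [canary]
    for server in rest:
        deploy(server)
    return [canary] + rest


# Hypothetical fleet: server name -> running version.
fleet = {"web1": "v1", "web2": "v1", "web3": "v1"}


def deploy_v2(server: str) -> None:
    fleet[server] = "v2"


# The canary fails its health check, so only web1 is touched.
upgraded_on_failure = canary_rollout(list(fleet), deploy_v2, lambda s: False)
versions_after_failure = dict(fleet)

# The canary is healthy, so the whole fleet is upgraded.
upgraded_on_success = canary_rollout(list(fleet), deploy_v2, lambda s: True)
print(upgraded_on_failure, upgraded_on_success)
```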

Infrastructure as code toolchain

Configuration Management
Chef
Puppet
Ansible
Salt
CFEngine
Packer

Services Directory Tools
etcd
ZooKeeper
Consul

Containers
Docker Swarm
Google Kubernetes
Apache Mesos

Private Container Services
Rancher
Google Cloud Platform
Amazon Web Services ECS


Key Success Metrics

  • Deployment frequency
  • Lead time for changes
  • Change failure rate
  • Mean time to recovery
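
As a rough illustration, the four metrics above can be computed from a list of deployment records. The Deployment record and its field names are assumptions made for this sketch, and the sample data is invented. Real numbers would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def summarize(deployments: List[Deployment], period_days: int) -> dict:
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "mean_lead_time": sum(lead_times, timedelta()) / len(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Invented sample data covering a four-day period.
start = datetime(2019, 12, 1, 9, 0)
history = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=4), failed=True,
               restored_at=start + timedelta(hours=5)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=6)),
    Deployment(start + timedelta(days=2), start + timedelta(days=2, hours=4)),
]
summary = summarize(history, period_days=4)
print(summary)
```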

Design for operations

The Twelve Factors - a manifesto on how to build software
https://github.com/factorish/factorish - helps make your app more 12-factor

The best way to avoid failure is to fail constantly.
This is what Chaos Monkey does.

Operate for design

How do complex systems fail?

Change introduces new forms of failure.
Complex systems contain changing mixtures of failures latent within them.
All complex systems are always running in degraded mode.


Monitoring by LEAN
1) Set up a bare minimum of monitoring
2) Measure - get metrics
3) Analyze results
4) Repeat

MONITORING AREAS

1) Service performance and uptime - monitoring implemented at the very highest level of a service set or application. These checks are often referred to as synthetic checks, because they do not involve real customers or real traffic. It's the simplest form of monitoring, answering the question: is it working?

2) Software component - this is monitoring that's done on ports or processes, usually located on the host. This moves into layers, so instead of answering, is my service working, it's asking, is this particular host working.

3) System metrics - these can be anything, such as CPU or memory. They are time series metrics that get stored and graphed, so you can look at them and answer the question: is this service, host, or process functioning normally?

4) App metrics - telemetry from your application that gives you a sense of what it is actually doing. Examples: how long a certain function call takes, the number of logins in the last hour, or a count of all the error events that have happened.

5) Performance - tied through all of the previous types of metrics are hints of performance.
APM is an instrumentation framework that isolates function performance at the code level. Real user monitoring (RUM) usually uses front-end instrumentation, such as a JavaScript page tag. It captures the performance observed by the users of the actual system, so it can tell you what your customers are actually experiencing. This is opposed to synthetic checks, which tell you what customers are probably experiencing.

6) Security monitoring includes four key areas:

  • System security: things like bad TLS/SSL settings, open ports and services, or other system configuration problems.
  • Application security: knowing when XSS or SQL injection attempts are happening on your site.
  • Custom events in the application: things like password resets, invalid logins, or new account creations.
  • Anomalies


For custom events, ask: what are the flows in your system that would be indicators of compromise, or that people would want to abuse if they could? Anomalies are things like a run of HTTP 401s, or access that stems from irregular IP segments.
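
The HTTP 401 anomaly mentioned above can be sketched as a simple counter. The function name suspicious_ips, the threshold, and the sample addresses (taken from documentation IP ranges) are all assumptions for illustration. A real setup would read these pairs from access logs.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def suspicious_ips(requests: Iterable[Tuple[str, int]], threshold: int) -> List[str]:
    """Return source IPs with at least `threshold` HTTP 401 responses."""
    failures = Counter(ip for ip, status in requests if status == 401)
    return sorted(ip for ip, count in failures.items() if count >= threshold)


# Invented sample: (source IP, HTTP status) pairs.
sample = [
    ("10.0.0.5", 200),
    ("203.0.113.7", 401),
    ("203.0.113.7", 401),
    ("10.0.0.5", 401),
    ("203.0.113.7", 401),
]
flagged = suspicious_ips(sample, threshold=3)
print(flagged)
```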

LOGS
What happened?
When did it happen?
Where did it happen?
Who was involved?
Where did that entity come from?
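
One way to make sure every log line answers these five questions is to emit structured records. This is a minimal sketch using only the Python standard library. The function log_event, the field names, and the sample values are assumptions for this example, not a standard log format.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def log_event(what: str, where: str, who: str, origin: str,
              when: Optional[datetime] = None) -> str:
    """Build one JSON log line that answers the five questions above."""
    when = when or datetime.now(timezone.utc)
    record = {
        "what": what,                                       # what happened
        "when": when.astimezone(timezone.utc).isoformat(),  # when, in UTC
        "where": where,                                     # service or host
        "who": who,                                         # user or entity involved
        "origin": origin,                                   # where the entity came from
    }
    return json.dumps(record, sort_keys=True)


line = log_event(
    what="login_failed",
    where="auth-service",
    who="alice",
    origin="203.0.113.7",
    when=datetime(2019, 12, 11, 10, 30, tzinfo=timezone.utc),
)
print(line)
```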

Principles of logging
1) Do not collect logs if you don't plan to use them
2) Retain log data for as long as it is conceivable that it can be used
3) Log all you can, but alert only on what you must respond to (customer experience problems and security problems)
4) Don't try to make your logging more secure than your production stack
5) Logs change. New versions of software bring changes.

Nagios, Zabbix
SaaS monitoring (Pingdom, Datadog, Netuitive, Ruxit, Librato, New Relic, AppDynamics)
Open source monitoring (Graphite, Grafana, StatsD, Ganglia, InfluxDB, OpenTSDB, metrics.dropwizard.io)
Icinga, Sensu (Nagios-like)
Prometheus, Sysdig

Log Management:
ELK
PagerDuty, VictorOps
Flapjack

DevOps News

devopsweekly.com
devops.com
infoq.com/devops
dzone.com/devops-tutorials-tools-news
devopscafe.org
theshipshow.com
arresteddevops
foodfightshow.org
devopsmastery.com
kitchensoap.com
itrevolution.com/devops-blog
dev2ops.org
continuousdelivery.com/blog
jedi.de/blog


Source:
https://www.linkedin.com/learning/devops-foundations/
