Azure instance memory hog
Azure instance memory hog disrupts the state of infrastructure resources.
- It induces stress on the Azure instance using the Azure Run command. The Azure Run command is executed using the in-built bash scripts within the fault.
- It utilizes memory in excess on the Azure instance using the bash script for a specific duration.
Use cases
Azure instance memory hog:
- Determines the resilience of an Azure instance when memory resources are unexpectedly utilized in excess.
- Determines how Azure scales the memory to maintain the application when resources are consumed heavily.
- Simulates the situation of memory leaks in the deployment of microservices.
- Simulates a slowed application caused by lack of memory.
- Simulates noisy neighbour problems due to hogging.
- Verifies pod priority and QoS setting for eviction purposes.
- Verifies application restarts on OOM (out of memory) kills.
Prerequisites
- Kubernetes >= 1.17
- Azure Run Command agent is installed and running in the target Azure instance.
- Azure instance should be in a healthy state.
- Use Azure file-based authentication to connect to the instance using the Azure Go SDK. To generate the auth file, run the following Azure CLI command:
  az ad sp create-for-rbac --sdk-auth > azure.auth
- A Kubernetes secret should contain the auth file created in the previous step in the CHAOS_NAMESPACE. Below is a sample secret file:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  azure.auth: |-
    {
      "clientId": "XXXXXXXXX",
      "clientSecret": "XXXXXXXXX",
      "subscriptionId": "XXXXXXXXX",
      "tenantId": "XXXXXXXXX",
      "activeDirectoryEndpointUrl": "XXXXXXXXX",
      "resourceManagerEndpointUrl": "XXXXXXXXX",
      "activeDirectoryGraphResourceId": "XXXXXXXXX",
      "sqlManagementEndpointUrl": "XXXXXXXXX",
      "galleryEndpointUrl": "XXXXXXXXX",
      "managementEndpointUrl": "XXXXXXXXX"
    }
If you change the secret key name from azure.auth to a new name, ensure that you update the AZURE_AUTH_LOCATION environment variable in the chaos experiment with the new name.
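As a sketch of the note above, the following fragments assume a hypothetical key name, my-azure.auth, chosen only for illustration. The first fragment renames the key in the secret, and the second sets AZURE_AUTH_LOCATION to the same name in the experiment's env list. All credential values are placeholders.

```yaml
# Secret with a custom key name (my-azure.auth is a hypothetical name)
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  my-azure.auth: |-
    {
      "clientId": "XXXXXXXXX",
      "clientSecret": "XXXXXXXXX",
      "subscriptionId": "XXXXXXXXX",
      "tenantId": "XXXXXXXXX",
      "activeDirectoryEndpointUrl": "XXXXXXXXX",
      "resourceManagerEndpointUrl": "XXXXXXXXX",
      "activeDirectoryGraphResourceId": "XXXXXXXXX",
      "sqlManagementEndpointUrl": "XXXXXXXXX",
      "galleryEndpointUrl": "XXXXXXXXX",
      "managementEndpointUrl": "XXXXXXXXX"
    }
---
# Matching fragment of the experiment env list in the ChaosEngine
env:
  # must match the key name used in the secret above
  - name: AZURE_AUTH_LOCATION
    value: 'my-azure.auth'
```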
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
AZURE_INSTANCE_NAMES | Names of the target Azure instances. | Multiple values can be provided as a comma-separated string. For example, instance-1,instance-2. For more information, go to stop instance by name. |
RESOURCE_GROUP | Name of the Azure resource group that contains the target instances. | All the instances must be from the same resource group. For more information, go to resource group field in the YAML file. |
Optional tunables
Tunable | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration for which chaos is injected into the target resource (in seconds). | Defaults to 30s. For more information, go to duration of the chaos. |
CHAOS_INTERVAL | Time interval between two successive chaos iterations (in seconds). | Defaults to 60s. For more information, go to chaos interval. |
AZURE_AUTH_LOCATION | Name of the Azure secret credentials file. | Defaults to azure.auth. |
SCALE_SET | Specifies whether the instance is part of a scale set. | Defaults to disable. Also supports enable. For more information, go to scale set instances. |
MEMORY_CONSUMPTION | Amount of memory to be consumed in the Azure instance (in megabytes). | Defaults to 500 MB. For more information, go to memory consumption in megabytes. |
MEMORY_PERCENTAGE | Amount of memory to be consumed in the Azure instance (in percentage). | Defaults to 0. For more information, go to memory consumption in percentage. |
NUMBER_OF_WORKERS | Number of workers used to run the stress process. | Defaults to 1. For more information, go to multiple workers. |
DEFAULT_HEALTH_CHECK | Determines whether to run the default health check that is built into the fault. | Defaults to true. For more information, go to default health check. |
SEQUENCE | Sequence of chaos execution for multiple target instances. | Defaults to parallel. Also supports serial sequence. For more information, go to sequence of chaos execution. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30s. For more information, go to ramp time. |
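The timing tunables in the table above can be combined in a single ChaosEngine. The following is a minimal sketch, not a definitive configuration: it assumes the values are passed as plain numbers of seconds (based on the "in seconds" notes in the table), and the specific numbers are illustrative only. The instance and resource group names are the same placeholders used elsewhere on this page.

```yaml
# illustrative timing settings; all numeric values are assumptions
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # total time for which chaos is injected (in seconds)
        - name: TOTAL_CHAOS_DURATION
          value: '120'
        # wait between successive chaos iterations (in seconds)
        - name: CHAOS_INTERVAL
          value: '60'
        # wait before and after injecting chaos (in seconds)
        - name: RAMP_TIME
          value: '30'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```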
Memory consumption in megabytes
It specifies the memory utilised (in MB) on the Azure instance. Tune it by using the MEMORY_CONSUMPTION environment variable.
Use the following example to tune it:
# memory in MB to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # memory to consume (in MB)
        - name: MEMORY_CONSUMPTION
          value: '1024'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'
Memory consumption in percentage
It specifies the memory utilised (in percentage) on the Azure instance. Tune it by using the MEMORY_PERCENTAGE environment variable.
Use the following example to tune it:
# memory percentage to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # memory to consume (in percentage)
        - name: MEMORY_PERCENTAGE
          value: '50'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'
Multiple Azure instances
It specifies comma-separated Azure instance names that are subject to chaos in a single run. Tune it by using the AZURE_INSTANCE_NAMES environment variable.
Use the following example to tune it:
# multiple instance targets
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # names of the Azure instances
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1,instance-2'
        # resource group for the Azure instances
        - name: RESOURCE_GROUP
          value: 'rg-azure'
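When several instances are targeted, the SEQUENCE tunable from the optional tunables table controls whether they are stressed in parallel (the default) or one after another. The following is a minimal sketch that assumes the value is written as the lowercase string 'serial', matching the wording in the table. The instance and resource group names are placeholders.

```yaml
# multiple instance targets, stressed one at a time
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # 'parallel' is the default; 'serial' is assumed to target instances in turn
        - name: SEQUENCE
          value: 'serial'
        # names of the Azure instances
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1,instance-2'
        # resource group for the Azure instances
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```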
Multiple workers
It specifies the number of workers that run the stress process to spike the memory utilisation. Increasing the number of workers increases the memory consumption. Tune it by using the NUMBER_OF_WORKERS environment variable.
Use the following example to tune this:
# multiple workers to utilize resources
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # number of workers that run the stress process
        - name: NUMBER_OF_WORKERS
          value: '3'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'