An open-source/in-house application performance management server that monitors your services and allows you to perform historical/real-time analysis for your endpoints.
First, make sure to initialize the schema in both Cassandra and Elasticsearch:
./elastic_init_index.sh
Then, in cqlsh:
create keyspace simple_apm with replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
use simple_apm;
create table request_info (service_name text, url text, method text, status smallint, response_time int, created_at timestamp, primary key (service_name, created_at, method, url));
To run the server
# run one instance of the server to handle HTTP requests and push jobs into queues
make run_server
To run workers
# you need at least two types of workers online
make run_worker job_type=ES_SYNC target_queue=sync_es batch_size=5 # Elasticsearch sync worker
make run_worker job_type=DB_WRITE target_queue=cassendra_write batch_size=5 # Cassandra DB write worker
params:
- batch_size: number of jobs to handle by the worker per one execution
- target_queue: the job queue the worker will listen to
- job_type: the type of job the spawned worker will handle
List of environment variables that need to be set up before starting the service:
- PRODUCTION_QUEUES: comma-separated list of queues the server/producer pushes jobs into
- REDIS_CONNECTION_URL: connection URL used to set up the Redis client
- JWT_SECRET: secret used to sign and verify tokens between the SDK and the server
- CASSENDRA_KEY_SPACE: name of the keyspace where the tables are stored
- CASSENDRA_HOSTS: the Cassandra nodes that are under simple-apm
- SERVER_PORT: port used by the HTTP server
- ES_BULK_INSERTION_CONCURRENCY: max concurrency level used in bulk ES index updates
Available SDKs:
- JS: <github.com/Kareem-Emad/simple-apm-express>
You can build your own custom SDK in whatever language/way you want; just make sure you:
- sign a token with the same secret set here in the envs
- use the same request format:
curl --location --request POST 'http://localhost:5000/requests' \
--header 'Authorization: Bearer jwt_token_placeholder' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://google.com",
"http_method": "GET",
"response_time": 3000,
"service_name": "service_name",
"status_code": 200,
"created_at": "2020-04-02 02:10:01"
}'
- make sure you use the same date format, as the Elasticsearch index is optimized specifically for this layout as shown below (a minimal SDK sketch follows)
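To make the custom-SDK requirements concrete, here is a minimal sketch in Python, assuming the PyJWT and requests packages. The report_request helper, the token claims, and the HS256 algorithm are illustrative assumptions and have to match whatever the server actually verifies; the request body fields mirror the curl example above.

```python
# Hypothetical custom-SDK sketch: sign a token with the shared JWT_SECRET and
# report a handled request to the simple-apm server in the expected format.
import os
from datetime import datetime, timezone

import jwt       # PyJWT
import requests

APM_URL = "http://localhost:5000/requests"  # port comes from SERVER_PORT
SECRET = os.environ["JWT_SECRET"]


def report_request(service_name, url, method, status_code, response_time_ms):
    # Claims and algorithm are assumptions; they must match the server's verification.
    token = jwt.encode({"service_name": service_name}, SECRET, algorithm="HS256")
    payload = {
        "url": url,
        "http_method": method,
        "response_time": response_time_ms,
        "service_name": service_name,
        "status_code": status_code,
        # Elasticsearch expects exactly this layout: yyyy-MM-dd HH:mm:ss
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"),
    }
    resp = requests.post(
        APM_URL,
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=5,
    )
    resp.raise_for_status()


report_request("my_service", "https://google.com", "GET", 200, 3000)
```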
The Cassandra database contains one table called request_info, with the following fields:
Field | Data Type | Description |
---|---|---|
service_name | text/varchar | name of the service that contains this endpoint |
url | text/varchar | full URL of the route in this service |
method | text/varchar | type of the request handled by the endpoint |
status | small_int | code returned to the requestor |
response_time | int | time taken from receiving the request in the server to responding to the client |
created_at | timestamp | the timestamp when this request was made |
We have two types of indexes:
- partition index: set to the service_name column
- cluster index: set to (created_at, method, url), in that order
Note that queries against Cassandra should follow this key order to avoid latencies.
So it's better to always start by specifying the service_name, then the range of dates (created_at), then the HTTP method (method), and finally the endpoints you want to include in the query result (url), as in the sketch below.
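As an illustration of that access pattern, here is a minimal sketch using the DataStax cassandra-driver package (an assumption; any client works). The host and keyspace are placeholders standing in for CASSENDRA_HOSTS and CASSENDRA_KEY_SPACE.

```python
# Minimal sketch of querying request_info following the key order, assuming
# the cassandra-driver package and a local node (placeholder values below).
from datetime import datetime

from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])           # CASSENDRA_HOSTS
session = cluster.connect("simple_apm")    # CASSENDRA_KEY_SPACE

rows = session.execute(
    """
    SELECT method, url, status, response_time, created_at
    FROM request_info
    WHERE service_name = %s
      AND created_at >= %s AND created_at <= %s
    """,
    ("YOUR_SERVICE_NAME", datetime(2020, 4, 1), datetime(2020, 4, 10)),
)
for row in rows:
    print(row.created_at, row.method, row.url, row.status, row.response_time)

# Note: once created_at is restricted by a range, Cassandra only allows further
# restrictions on the later clustering columns (method, url) with ALLOW FILTERING,
# so keep the date range as the last restricted clustering column when possible.
```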
The Elasticsearch index request_info contains the following fields (an illustrative mapping sketch follows the table):

Field | Data Type |
---|---|
service_name | keyword |
url | text |
http_method | keyword |
status_code | short |
response_time | long |
created_at | date in format yyyy-MM-dd HH:mm:ss |
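For reference, a hedged sketch of the mapping the table above corresponds to. The index is normally created by ./elastic_init_index.sh, so this is illustrative only; the client call and any settings beyond the listed field types are assumptions.

```python
# Illustrative only: create a request_info index with the mapping described in
# the table above, assuming the official elasticsearch Python client.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="request_info",
    body={
        "mappings": {
            "properties": {
                "service_name": {"type": "keyword"},
                "url": {"type": "text"},
                "http_method": {"type": "keyword"},
                "status_code": {"type": "short"},
                "response_time": {"type": "long"},
                "created_at": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
            }
        }
    },
)
```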
To access data through the Elasticsearch server, there are already some ready-to-use queries to fetch important data like throughput, average_response_time, min/max_response_time, and x_percentile.
To calculate throughput, we need to use date histograms in Elasticsearch.
On the assumption that throughput is the number of successful requests handled per minute, we can achieve it with a query/aggregate like this one:
curl -X GET "localhost:9200/request_info/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [ {"match": {"service_name": "YOUR_SERVICE_NAME"}} , {"match":{"url": "SOME_URL"}}, {"match":{"http_method": "GET"}}],
"filter": {
"range": {
"created_at": {
"gte": "START_DATE IN FORMAT (2020-04-01 02:01:01)",
"lte": "END_DATE IN FORMAT (2020-04-10 02:01:01)"
}
}
}
}
}, "aggs": {
"throughput_over_time": {
"date_histogram": {
"field": "created_at",
"fixed_interval": "1m"
}
}
}
}
'
Note that you can omit some of the match queries in the must clause to calculate the throughput curve over your whole service or over a specific set of endpoints.
Also notice that the interval is tunable through the fixed_interval field, meaning you can calculate the throughput of total successful requests per minute/hour/day/month/etc., as in the sketch below.
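If you prefer calling Elasticsearch from code instead of curl, here is a hedged sketch of the same throughput aggregation, assuming the official elasticsearch Python client (7.x-style body= call); the service name, dates, and interval are placeholders you would tune.

```python
# Run the throughput date_histogram aggregation against the request_info index,
# assuming the elasticsearch Python client and a local node on port 9200.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

body = {
    "size": 0,  # only the aggregation buckets are needed
    "query": {
        "bool": {
            "must": [
                {"match": {"service_name": "YOUR_SERVICE_NAME"}},
                {"match": {"http_method": "GET"}},
            ],
            "filter": {
                "range": {
                    "created_at": {
                        "gte": "2020-04-01 02:01:01",
                        "lte": "2020-04-10 02:01:01",
                    }
                }
            },
        }
    },
    "aggs": {
        "throughput_over_time": {
            # change fixed_interval to "1h"/"1d"/... for per-hour/day throughput
            "date_histogram": {"field": "created_at", "fixed_interval": "1m"}
        }
    },
}

result = es.search(index="request_info", body=body)
for bucket in result["aggregations"]["throughput_over_time"]["buckets"]:
    print(bucket["key_as_string"], bucket["doc_count"])
```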
To get the full stats for a certain set of endpoints or a whole service, we can use this query/aggregate:
curl -X GET "localhost:9200/request_info/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [ {"match": {"service_name": "YOUR_SERVICE_NAME"}} , {"match":{"url": "SOME_URL"}}, {"match":{"http_method": "POST"}}],
"filter": {
"range": {
"created_at": {
"gte": "START_DATE",
"lte": "END_DATE"
}
}
}
}
}, "aggs": {
"request_percentiles": {
"percentiles": {
"field": "response_time"
}
},
"request_stats":{
"stats":{
"field": "response_time"
}
}
}
}
'
Note that you need to specify the period of your search in the range filter in both curls to get your stats within a certain timeline (last week/month/...).