Parsing Qlik Sense healthcheck API results
2021-09-15
Abstract
When you stress test or troubleshoot a Qlik Sense node, it is useful to collect the responses of the healthcheck API and extract some useful information from them (which and how many applications were loaded in memory, …)
Collecting data
I usually use the command-line tool qsense for querying the Qlik Sense repository:
while true
do
    qsense healthcheck qlikhost1.redaelli.org ~/certificates/qlik/client.pem >> healthcheck.jl
    sleep 60
done
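If qsense is not available, a rough Python sketch of the same polling loop could look like the following. The endpoint URL, the certificate handling and the X-Qlik-User header are assumptions about a default installation, so adjust them to your environment; the scripts further down also rely on a now timestamp on each record, so the sketch adds one when it is missing.

import json
import os
import time

import requests

# all assumptions about a default installation: endpoint, certificate
# (client cert + key in a single pem) and the X-Qlik-User header
URL = "https://qlikhost1.redaelli.org:4747/engine/healthcheck"
CERT = os.path.expanduser("~/certificates/qlik/client.pem")
HEADERS = {"X-Qlik-User": "UserDirectory=internal; UserId=sa_api"}

with open("healthcheck.jl", "a") as out:
    while True:
        record = requests.get(URL, cert=CERT, headers=HEADERS, verify=False).json()
        # the scripts below also expect a "now" timestamp on each record
        record.setdefault("now", time.strftime("%Y-%m-%dT%H:%M:%S"))
        out.write(json.dumps(record) + "\n")
        out.flush()
        time.sleep(60)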
Each line of the file healthcheck.jl is a JSON object like:
{
  "version": "12.763.10",
  "started": "20210915T165938.000+0200",
  "mem": {
    "committed": 72283.234375,
    "allocated": 118266.04296875,
    "free": 436586.79296875
  },
  "cpu": {
    "total": 0
  },
  "session": {
    "active": 1,
    "total": 63
  },
  "apps": {
    "active_docs": [
      "059e6baa-3b4a-4648-bbdb-47013a02dc21"
    ],
    "loaded_docs": [
      "15e0547c-c4eb-4492-a1db-1603d8295423",
      "163777f8-9582-46b0-9418-a01f2d71c32d",
      "059e6baa-3b4a-4648-bbdb-47013a02dc21"
    ],
    "in_memory_docs": [
      "15e0547c-c4eb-4492-a1db-1603d8295423",
      "163777f8-9582-46b0-9418-a01f2d71c32d",
      "059e6baa-3b4a-4648-bbdb-47013a02dc21"
    ],
    "calls": 17126,
    "selections": 300
  },
  "users": {
    "active": 1,
    "total": 6
  },
  "cache": {
    "hits": 70,
    "lookups": 70,
    "added": 0,
    "replaced": 0,
    "bytes_added": 0
  },
  "saturated": false,
  ...
}
It can be useful to show application names instead of their IDs, so we also download the list of published applications with:
qsense entity qlikhost1.redaelli.org ~/certificates/qlik/client.pem app --filter "published eq true" > app.json
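Only the id and name fields of app.json are used below; a small sketch to verify the id-to-name mapping:

# build an application id -> name lookup from the downloaded entity dump
import pandas as pd

apps = pd.read_json("app.json")
id_to_name = dict(zip(apps["id"], apps["name"]))
print(len(id_to_name), "published applications")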
Extracting some info
With the following script
python healthcheck.py healthcheck.jl healthcheck-out
you can extract a semicolon-separated CSV file with the following columns (one column per application found in memory):
now, mem_free, session_active, session_total, users_active, users_total, app1, app2, app3, ...
Below is the source of the script:
## file healthcheck.py
import sys

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

infile = sys.argv[1]
outfile = sys.argv[2]

spark = SparkSession.builder.getOrCreate()

# healthcheck samples (one JSON object per line) and the list of published apps
df = spark.read.json(infile)
apps = spark.read.json("app.json")

# keep the interesting metrics, explode the list of in-memory apps,
# join with the app list to get names, then pivot one column per app
df.select(F.col("now"),
          F.col("mem.free").alias("mem_free"),
          F.col("session.active").alias("session_active"),
          F.col("session.total").alias("session_total"),
          F.col("users.active").alias("users_active"),
          F.col("users.total").alias("users_total"),
          F.col("apps.in_memory_docs")) \
  .withColumn("id", F.explode(F.col("in_memory_docs"))) \
  .join(apps, "id", how="left") \
  .withColumn("fullname", F.concat("name", "id")) \
  .select(["now", "mem_free", "session_active", "session_total",
           "users_active", "users_total", "fullname"]) \
  .groupBy(["now", "mem_free", "session_active", "session_total",
            "users_active", "users_total"]) \
  .pivot("fullname").count() \
  .coalesce(1) \
  .write.mode("overwrite").option("sep", ";").option("header", "true").csv(outfile)
When and how many times was the engine restarted? And what happened just before? The following script prints each engine restart and the applications that were active and loaded just before it:
import collections
import pprint
import sys

import pandas as pd


def load_data(infile):
    # one healthcheck JSON object per line
    return pd.read_json(infile, lines=True)


def load_apps():
    # application list downloaded with "qsense entity ... app" (not used below)
    return pd.read_json("app.json")


def parse_healthcheck(infile):
    def lists_diff(li1, li2):
        # symmetric difference of two lists
        return list(set(li1) - set(li2)) + list(set(li2) - set(li1))

    df = load_data(infile)
    started = None
    active, loaded, last_active, last_loaded = [], [], [], []
    for index, row in df.iterrows():
        if row["started"] != started:
            # the engine start time changed: the engine was restarted
            print("******************************************")
            print("Engine started at {started}".format(started=row["started"]))
            print("******************************************")
            started = row["started"]
            print("Previous active:")
            print(active)
            print("Previous loaded:")
            print(loaded)
            last_active = last_active + active
            last_loaded = last_loaded + loaded
        new_active = row["apps"]["active_docs"]
        new_loaded = row["apps"]["loaded_docs"]
        delta_active = lists_diff(new_active, active)
        if delta_active != []:
            pprint.pprint(str(row["now"]) + ": new active apps: " + str(delta_active))
        delta_loaded = lists_diff(new_loaded, loaded)
        if delta_loaded != []:
            pprint.pprint(str(row["now"]) + ": new loaded apps: " + str(delta_loaded))
        active = new_active
        loaded = new_loaded
    pprint.pprint("Latest loaded apps: " + str(collections.Counter(last_loaded)))
    pprint.pprint("Latest active apps: " + str(collections.Counter(last_active)))
    pprint.pprint("Latest restarts: " + str(df.started.unique()))


if __name__ == "__main__":
    # execute only if run as a script
    infile = sys.argv[1]
    parse_healthcheck(infile)
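The script is run against the same collected file, for example (assuming it is saved as restarts.py; the file name is not fixed above):
python restarts.py healthcheck.jl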