Week 3 – Day 2

Today was not fun 😩

Yeah, it was not fun, but I did learn something. As I said yesterday, my main task now is to clearly define my goal, that is, the anomalies I will be looking for when analyzing the logs. More than defining them, I have to answer the “how” question: how am I going to use the information available in the logs to detect my target anomalies?

Alright, now why was the day not fun? Well, I spent most of the day trying to understand the text in the logs. I can assure you, it was not fun, and it was my first time doing this kind of activity.

However, that only lasted until I started understanding some of the output. 🙂 I was able to:

  • distinguish the disk and system checks, which are done periodically (every minute)
  • distinguish what is recorded in the log file when a request is emitted for either Thor or Roxie
  • find the log file which registers the info about Roxie queries run on WsECL (ESP.log)

With what I have learned today, I developed some ideas about anomalies I can target. For example, Roxie is said to be the Rapid Data Delivery Engine. Hence, I may look for queries which take too much time by analyzing the running time of queries in the log file. My analysis should help me set a baseline for what is considered an appropriate running time and use that to single out “anomalous queries”.
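To make the idea a bit more concrete, here is a minimal sketch in Python of what such a baseline could look like. It assumes the query names and running times (in milliseconds) have already been extracted from the log; the extraction itself is not shown. The sample values, the function name flag_slow_queries and the threshold of 3.5 are all hypothetical, and the median/MAD rule is just one possible choice of baseline, not the method decided on for the project.

```python
from statistics import median


def flag_slow_queries(durations_ms, threshold=3.5):
    """Return the (name, running_time) pairs that look anomalously slow.

    durations_ms: list of (query_name, running_time_ms) tuples, assumed to be
    already extracted from the log file (hypothetical input format).
    threshold: cut-off on the modified z-score (3.5 is a common rule of thumb).
    """
    times = [t for _, t in durations_ms]
    baseline = median(times)  # the "appropriate" running time
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(t - baseline) for t in times])
    if mad == 0:
        # All running times are (almost) identical: nothing to flag.
        return []
    flagged = []
    for name, t in durations_ms:
        # Modified z-score; only slower-than-baseline queries can exceed it.
        score = 0.6745 * (t - baseline) / mad
        if score > threshold:
            flagged.append((name, t))
    return flagged


# Made-up running times, for illustration only (not taken from a real log).
sample = [("q1", 12), ("q2", 15), ("q3", 11), ("q4", 14), ("q5", 13), ("q6", 250)]
print(flag_slow_queries(sample))
```

The median is used instead of the mean so that a handful of very slow queries does not drag the baseline up and hide themselves.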

That is currently one of my ideas. I have to clarify everything tomorrow and format it well in a document which I will submit to my mentors for review. I do feel better than yesterday.

Hopefully, I am heading the right way 😉

Week 3 – Day 1

Hi there,

Hope you started your week well 🙂

As far as I am concerned, I am making little progress towards an unclear destination. In fact, that was the main topic of my meeting today with my mentors about the project progress. Indeed, I did work last week on an approach to produce a log-based anomaly detection system. However, I realized my procedure or approach had no specific target (yet). My target anomalies are not yet well defined.

To recall, I intend to build an anomaly detection system which would autonomously detect fraud in a log file, and more precisely in a Roxie log file. Before starting the project, I did some research on the topic and found many approaches where the developers did not have to understand the logs at all. However, to build their systems, they did get some insights from subject matter experts or, simply put, people who had some understanding of the logs.

That is where my task for this week emerges. Now, I have to turn myself into a subject matter expert. I have to understand the Roxie logs. My advisor did tell me about that requirement during my first meeting, but I did not understand what he meant at that time. Now, I am reading the documentation on Roxie and re-watching the online classes on Roxie, and tomorrow I will also run queries on Roxie and try to understand the output in the log file. Hopefully, I will be able to get some heuristics and pinpoint which anomalies I can detect through Roxie log analysis.

See you tomorrow and stay blessed 😉

Week 2 – Day 5

Last day of the week!

Overall, the week has been fruitful. I am currently done with the Online Roxie classes (Yeah). Moreover, I made significant progress on both the solution design document and the log parsing code. However, those two will be reviewed tomorrow morning during an audio conference call with my mentors.

To sum up, I achieved about 90% of my objectives for the week. Tomorrow, after the meeting, I will have a better perspective on the next steps.

See you tomorrow and stay blessed as always 🙂

Week 2 – Day 4

Today was an even “more interesting day”.

Again, I moved forward with the online lessons. I now have four lessons remaining in the Roxie Online class part 2. After that, I think I will be working fully on the project implementation.

Concerning the solution design, I produced the document and improved it several times with the precious and remarkable help of my mentor. I can say the document is 90% done. My supervisor still has to make remarks or suggestions on the latest update of the document, which I sent her today.

Another important achievement of the day was the log parsing code, which I got done. My mentors still have to check the code.

I am slowly moving from the learning and designing phase to the implementation phase. I cannot wait to see the concrete solution working using HPCC Systems. I will share the Git Repo link here as soon as I get all approvals (solution design and code).

See you tomorrow for another day with LexisNexis 😊

Week 2 – Day 3

I would qualify this day as “interesting”.

I did finish the Roxie Course part 1 and started the Roxie course part 2. Part 2 has 12 lessons and I finished the first 4. I think I am on track to finish the online classes this week as scheduled.

Moreover, I have started the parsing code. I should have a kind of final version tomorrow which I will share with my mentors through the Git Repo I created today as well.

Also, I discussed the solution design with my mentor. I sent her my approach and my understanding of the approach. Just as a reminder, I am trying to find anomalies in a log file using Unsupervised Learning Methods. In the approach I proposed, I used a random log sequence to demonstrate the approach. However, she told me to demonstrate using Roxie Logs, which are the ones I will officially be using for the project and which I am collecting daily. I went into one of those logs and made some “interesting” observations. I cannot say I understand those logs, but I have found some patterns which could be very useful for my solution design. I did send my observations to my mentor, as well as some questions. I am waiting for her reply to improve and finalize my solution design.

Stay blessed and see you tomorrow 🙂

Week 2 – Day 2

Welcome to day 2 of week 2!

Today was nice, cool!

I did continue with the Roxie Course part 1. Part 1 is made up of 14 lessons and I have the first 10 done. I should therefore be done with Part 1 tomorrow. Moreover, I almost finished the solution design document. I will revise it tomorrow and most probably send it to my mentors in the evening. The “cool” part of the day was when I got my little gift from LexisNexis as an intern. Just to recall, I am working remotely from Kennesaw State University. They sent me a cool bag and much more. Awesome 🙂

Tomorrow’s plan is to

  • Finish part 1 of the Roxie classes
  • Finalize the design document
  • Create a Git Repo for the code and share the link with my mentors.
  • Start writing the parsing code as soon as the solution strategy and parsing approach get approved.

Stay blessed 🙂

Week 2 – Day 1

Welcome to Week 2!

Here are my weekly objectives:

1- Parse Roxie Logs
2- Create the design Document
3- Finish the Online Classes

Today, I started the Roxie classes and recorded the generated logs. Moreover, I looked into the collected Roxie Logs to see how to go about parsing them. I have seen some interesting patterns already. Also, I have thought about the solution design, which is not yet fully clear.

Tomorrow, I will start writing the design document, hoping it will make things clearer. The design document will also help me validate my parsing approach. I am avoiding writing code without having a reasonably clear idea of my target. I will also obviously continue with the online classes.

See you tomorrow and stay blessed 🙂