Professional Context
I still remember the frustrating night I spent debugging a critical latency issue in our cloud-based application, only to realize that a simple misconfiguration in our AWS setup was the culprit. It was a painful reminder that even the smallest oversight can have a significant impact on performance. As I delved deeper into the issue, I wished I had a reliable tool to help me identify the root cause and provide a clear plan for optimization.
💡 Expert Advice & Considerations
Don't treat Perplexity as a replacement for human judgment. Use it to augment your existing workflows and automate tedious tasks, such as generating boilerplate code or analyzing log data.
Advanced Prompt Library
4 Expert Prompts
Optimize Cloud Resource Allocation
Given a cloud-based application with 5000 users, 1000 requests per second, and an average response time of 200ms, use queuing theory and simulation modeling to determine the optimal number of instances, instance types, and autoscaling policies to meet a 99.99% uptime SLA and minimize costs. Provide a detailed report including instance type recommendations, scaling policies, and estimated costs. Assume a 20% daily usage spike and a 10% monthly growth rate.
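To sanity-check whatever sizing the prompt returns, a rough queuing-theory estimate can be sketched as follows. It treats the 200ms average response time as the service time and models the fleet as an M/M/c queue, both of which are simplifying assumptions. `SLOTS_PER_INSTANCE` and `MAX_WAIT_PROBABILITY` are hypothetical values that do not come from the prompt; replace them with measured figures.

```python
import math


def offered_load(arrival_rate_rps, service_time_s):
    """Little's law: average number of requests in service at once (Erlangs)."""
    return arrival_rate_rps * service_time_s


def erlang_c(slots, load):
    """Probability that an arriving request has to queue in an M/M/c system."""
    if slots <= load:
        return 1.0  # unstable: the queue grows without bound
    blocking = 1.0  # Erlang B with zero slots
    for k in range(1, slots + 1):
        # Standard Erlang B recursion, numerically stable for large loads
        blocking = load * blocking / (k + load * blocking)
    utilization = load / slots
    return blocking / (1.0 - utilization * (1.0 - blocking))


def min_slots(load, max_wait_probability):
    """Smallest number of concurrent request slots that meets the wait target."""
    slots = max(1, math.ceil(load))
    while erlang_c(slots, load) > max_wait_probability:
        slots += 1
    return slots


def instances_needed(slots, slots_per_instance):
    """Convert request slots into whole instances."""
    return math.ceil(slots / slots_per_instance)


# Figures taken from the prompt
BASE_RPS = 1000
RESPONSE_TIME_S = 0.2
SPIKE_FACTOR = 1.2  # 20% daily usage spike

# Hypothetical values, not from the prompt
SLOTS_PER_INSTANCE = 25      # concurrent requests one instance can serve
MAX_WAIT_PROBABILITY = 0.05  # at most 5% of requests may queue

peak_load = offered_load(BASE_RPS * SPIKE_FACTOR, RESPONSE_TIME_S)
slots = min_slots(peak_load, MAX_WAIT_PROBABILITY)
print("peak offered load (Erlangs):", round(peak_load, 1))
print("request slots needed:", slots)
print("instances needed:", instances_needed(slots, SLOTS_PER_INSTANCE))
```

The estimate covers a single peak only. The 10% monthly growth rate and the 99.99% uptime target (which calls for redundancy across failure domains, not just capacity) still need to be layered on top.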
Design a Real-Time Data Processing Pipeline
Design a real-time data processing pipeline using Apache Kafka, Apache Storm, and Apache Cassandra to handle 100,000 events per second from IoT devices. The pipeline should support event aggregation, filtering, and transformation, and provide a data model for storing and querying the processed data. Assume a 10-node Kafka cluster, a 5-node Storm cluster, and a 10-node Cassandra cluster. Provide a detailed architecture diagram, component configurations, and a sample data model.
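Before wiring up Kafka, Storm, and Cassandra, it can help to prototype the filter, transform, and aggregate stages in plain Python. The sketch below is only an in-memory illustration of that logic, not a Storm topology. The `Event` fields, the valid temperature range, and the 60-second tumbling window are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical IoT reading; the field names are illustrative only."""
    device_id: str
    timestamp_s: float
    temperature_c: float


def is_valid(event):
    """Filter stage: drop readings outside an assumed plausible sensor range."""
    return -50.0 <= event.temperature_c <= 150.0


def to_fahrenheit(event):
    """Transform stage: convert the reading to Fahrenheit."""
    return event.temperature_c * 9.0 / 5.0 + 32.0


def aggregate(events, window_s=60):
    """Aggregate stage: mean reading per (device, tumbling-window start)."""
    sums = defaultdict(lambda: [0.0, 0])
    for event in events:
        if not is_valid(event):
            continue
        window_start = int(event.timestamp_s // window_s) * window_s
        acc = sums[(event.device_id, window_start)]
        acc[0] += to_fahrenheit(event)
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


sample = [
    Event("sensor-a", 0, 0.0),
    Event("sensor-a", 30, 100.0),
    Event("sensor-a", 61, 0.0),
    Event("sensor-a", 10, 999.0),  # out of range, dropped by the filter
]
print(aggregate(sample))
```

The `(device_id, window_start)` key is one possible starting point for the stored data model, for example a device as the partition key with the window start as a clustering column, though the right layout depends on the queries you need to serve.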
Conduct a Root Cause Analysis of a System Failure
A critical system failure occurred, resulting in a 2-hour downtime and significant revenue loss. Using the 5 Whys method and fault tree analysis, identify the root cause of the failure and provide a detailed report including the failure timeline, contributing factors, and recommended corrective actions. Assume the system consists of a load balancer, 5 web servers, 2 database servers, and a caching layer. Provide a failure probability estimate and a prioritized list of recommendations for preventing similar failures.
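The failure probability estimate the prompt asks for can be cross-checked with a small fault tree calculation. The sketch below assumes component failures are independent, that the system goes down if the load balancer fails, all 5 web servers fail, both database servers fail, or the caching layer fails, and that a cache outage is fatal. The per-component probabilities are hypothetical placeholders, not measured values.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must fail (redundant components), assuming independence."""
    return prod(probabilities)


def or_gate(probabilities):
    """Any single input failing triggers the event, assuming independence."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def system_failure_probability(p_lb, p_web, p_db, p_cache):
    """Top event for the topology in the prompt: 1 LB, 5 web, 2 DB, 1 cache."""
    all_web_down = and_gate([p_web] * 5)
    all_db_down = and_gate([p_db] * 2)
    return or_gate([p_lb, all_web_down, all_db_down, p_cache])


# Hypothetical failure probabilities over the analysis period
P_LOAD_BALANCER = 0.001
P_WEB_SERVER = 0.05
P_DB_SERVER = 0.01
P_CACHE = 0.005

top_event = system_failure_probability(
    P_LOAD_BALANCER, P_WEB_SERVER, P_DB_SERVER, P_CACHE
)
print(f"estimated system failure probability: {top_event:.6f}")
```

With numbers like these, the non-redundant components (load balancer and cache) dominate the top event, which is the kind of result that feeds directly into a prioritized list of recommendations.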
Develop a Machine Learning Model for Predictive Maintenance
Develop a machine learning model using scikit-learn and TensorFlow to predict equipment failures based on sensor data from 1000 machines. The model should support real-time prediction, anomaly detection, and automated alerting. Assume a dataset with 100 features, 100,000 samples, and a class imbalance ratio of 1:10. Provide a detailed model architecture, training and evaluation metrics, and a sample Python implementation. Use a combination of supervised and unsupervised learning techniques to improve model robustness.
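One detail worth checking in any generated implementation is how it handles the 1:10 class imbalance. A common approach is to weight each class inversely to its frequency, using the same heuristic as scikit-learn's `class_weight="balanced"` option: `n_samples / (n_classes * class_count)`. The sketch below computes those weights with the standard library only. It reads 1:10 as one failure for every ten normal samples, which is an assumption about the prompt, and the tiny label list is made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in counts.items()
    }


# Illustrative labels only: 0 = normal, 1 = failure, in a 10:1 ratio
labels = [0] * 10 + [1]
weights = balanced_class_weights(labels)
print(weights)
```

The resulting dictionary can be passed as the `class_weight` argument of many scikit-learn classifiers, or used to scale the loss per sample in TensorFlow.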