### Data Candidates for Cloud
-
### Conversation
- cloud db options
- Azure VM
- more expensive; requires us to manage a lot more ourselves
- Azure SQL Database
- groups multiple databases together under one logical server
- Azure SQL Managed Instance
- create a server that we
- has tools for server migration
- can do linked servers, SSIS agent jobs
- pricing
- discount for existing SQL contracts
- discount for longer-term contracts
- (josh) in cloud we'd need to rethink data manipulation, warehousing, history, replication
- no need to adopt new ETL tools
- benefits to cloud
- low maintenance. (josh) willing to pay a higher price to not have to do server migrations or think about updates
- we miss out on some of the value proposition of scalability
- (adriana) redefine roles we'll need
- infrastructure still
- more focus on security as we are not protected by the firewall
- training
- (josh) I'm expecting infrastructure to find someone with cloud experience to fill in for Theresa and Bill. This will limit the need for infra to touch our servers and set up new ones.
- no sense storing archival data in the cloud
- computational only data load
- types of data: transactional, archival, computational
- Josh prefers full cloud or full on-prem
- for machine learning sandbox
- draw from a common cloud source; will cache data to run learning
- (arash) guessing exploratory data analysis would work better in the cloud
- (arash) analytical machine learning on-prem
- (adriana) for high-performance computing you need the data in RAM
- (arash) at our scale, on-prem analysis will be more cost-effective
- how to try this out with minimal risk
### My Notes
- benefits
- don't manage infrastructure (easier); Database as a Service covers updates, security patches, monitoring, and backups
- reduced dependence on infrastructure?
- scalability: dynamically growing capacity to meet demand
- flexibility:
- not limited by physical hardware, licenses.
- more easily deploy new services.
- New toys like NoSQL, serverless, cloud-based IDE
- integration with other cloud resources: AI/ML, monitoring, analytics, cloud backup and restore, Power BI, data sources (CRM, SharePoint)
- costs
- makes me nervous: I don't know how to confidently answer how much this will cost
- how much are we paying now?
- what are the expectations of decision makers around cloud costs?
- concern it's going to be harder to do my job
- 66% of engineering teams report a lack of visibility into cloud costs causing some level of disruption to their work
- on-premise: hardware (servers, networks, firewalls), licensing, additional software (Redgate, backup, recovery), maintenance.
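
One way to start on the cost question is to put both options on the same annual basis. A rough sketch only: every figure is a made-up placeholder (not from our environment or from Azure pricing), and the names `OnPremCosts`, `CloudCosts`, and `annual_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    """Hypothetical on-prem cost inputs; replace with real figures."""
    hardware_purchase: float        # servers, networks, firewalls (one-time)
    hardware_lifetime_years: int    # years over which hardware is amortized
    licensing_per_year: float
    extra_software_per_year: float  # e.g. backup and recovery tooling
    maintenance_per_year: float

    def annual_cost(self) -> float:
        # Spread the one-time hardware purchase evenly over its lifetime.
        amortized_hardware = self.hardware_purchase / self.hardware_lifetime_years
        return (amortized_hardware + self.licensing_per_year
                + self.extra_software_per_year + self.maintenance_per_year)


@dataclass
class CloudCosts:
    """Hypothetical cloud cost inputs; replace with a real quote."""
    monthly_compute: float
    monthly_storage: float
    contract_discount: float = 0.0  # fraction, e.g. 0.25 for 25% off

    def annual_cost(self) -> float:
        monthly = self.monthly_compute + self.monthly_storage
        return 12 * monthly * (1 - self.contract_discount)


# Placeholder numbers only, chosen to make the arithmetic easy to follow.
on_prem = OnPremCosts(50_000, 5, 8_000, 2_000, 4_000)
cloud = CloudCosts(monthly_compute=1_500, monthly_storage=500, contract_discount=0.25)

print(f"on-prem per year: {on_prem.annual_cost():,.0f}")
print(f"cloud per year:   {cloud.annual_cost():,.0f}")
```

The placeholder inputs say nothing about which option is actually cheaper; the point is that "how much are we paying now?" has to be answered with the same line items before the comparison means anything.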

- common cases where cloud doesn't make sense
- extreme low latency needs
- strict compliance requirements: data residency, regulated data
- companies with substantial initial on-prem investments
- orgs with stable and predictable workloads
- initial work
- configure db: deployment scripts
- instance type: choose compute capacity (cpu, ram) and storage options.
- autoscaling options
- ~~high availability: configure multi-az deployment or replication for fault tolerance.~~
- backups: set up automated backups and retention policies.
- Azure Database Migration Service: move data from on-premise to cloud
- enable monitoring for key metrics and range-based notifications
- cpu usage
- memory usage
- disk i/o
- latency + execution time
- network traffic
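
To pin down what "range-based notifications" could mean for the metrics above, here is a minimal sketch. In practice a cloud monitoring service's alert rules would do this job; the metric names, the ranges, and the function `out_of_range` are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical acceptable ranges (low, high) per metric; tune to the workload.
METRIC_RANGES: Dict[str, Tuple[float, float]] = {
    "cpu_percent": (0.0, 80.0),
    "memory_percent": (0.0, 85.0),
    "disk_io_percent": (0.0, 90.0),
    "query_latency_ms": (0.0, 500.0),
    "network_mbps": (0.0, 800.0),
}


def out_of_range(sample: Dict[str, float]) -> List[str]:
    """Return one alert message per metric that falls outside its range."""
    alerts = []
    for name, value in sample.items():
        if name not in METRIC_RANGES:
            continue  # ignore metrics we have no range for
        low, high = METRIC_RANGES[name]
        if not low <= value <= high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts


# Made-up sample reading: only CPU is outside its range.
sample = {"cpu_percent": 93.0, "memory_percent": 60.0, "query_latency_ms": 120.0}
for message in out_of_range(sample):
    print(message)
```

Keeping the ranges in one table makes it easy to review them with whoever ends up owning the alerts.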
- ongoing effort
- security tasks: maintain secure access controls
- regularly verify backup and restore
- periodic performance tuning tasks
- configuration tasks
- ETL transition to cloud
- Azure Data Factory
- Redefine ETL processes to leverage cloud-native services (e.g., serverless functions, data lakes)
- Use cloud-native scheduling tools to manage ETL jobs.
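
A sketch of what "redefining" an ETL job for cloud scheduling might look like: the job becomes one small entry point that a scheduler or serverless trigger can call, and it is safe to re-run. Everything here is a stand-in: `sqlite3` in memory replaces the real target database, and the `sales` table, the sample rows, and the function names are hypothetical.

```python
import sqlite3
from typing import Iterable, List, Tuple

Row = Tuple[int, str, float]


def extract() -> List[Tuple[int, str, str]]:
    """Stand-in for reading from a source system (hypothetical data)."""
    return [(1, " widget ", "10.50"), (2, "gadget", "3.25"), (3, "", "1.00")]


def transform(rows: Iterable[Tuple[int, str, str]]) -> List[Row]:
    """Trim names, convert amounts to float, drop rows with an empty name."""
    cleaned = []
    for row_id, name, amount in rows:
        name = name.strip()
        if name:
            cleaned.append((row_id, name, float(amount)))
    return cleaned


def load(conn: sqlite3.Connection, rows: Iterable[Row]) -> int:
    """Write rows to the target table and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    # INSERT OR REPLACE keyed on id, so a retried run does not duplicate rows.
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]


def run_job(conn: sqlite3.Connection) -> int:
    """Single entry point a scheduler or serverless trigger could call."""
    return load(conn, transform(extract()))


conn = sqlite3.connect(":memory:")
print(run_job(conn))
```

The re-run safety matters more in the cloud than on a single agent server, since schedulers and triggers may retry a job after a transient failure.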
- what cloud resources are you most interested in utilizing at work?
- compute: VMs, containers, serverless (functions as a service)
- storage: blob, persistent disks, files
- databases: relational, NoSQL, warehouse, ETL
- networking: environment (VPCs, subnets), load balancers, content delivery networks
- security: identity management, encryption, secrets, firewalls
- analytics: data lakes, machine learning, visualization, reporting
- monitoring: notifications, logging