Data/ML Architect

Location: USA (Hybrid) / Job Type: Full Time

To apply, please email jobs@purgo.ai.

About the Role


The Data Architect role offers the successful candidate the opportunity to pioneer the adoption of generative AI in the design, development, and migration of data applications. This is a hands-on technical role involving deep collaboration with both Purgo AI’s product/engineering team and its customers and partners. The role drives the maturation and adoption of Purgo AI’s redefined software design lifecycle among both cloud data warehouse partners and customers.


Requirements


Responsibilities


  • Identify and quantify customer business requirements across the product’s use cases.
  • Design Purgo AI’s generative-AI-powered software design lifecycle solutions for the identified use cases across leading cloud data warehouses such as Snowflake and Databricks.
  • Architect automated machine learning training pipelines and machine learning inference setups generated dynamically from data.
  • Architect text-to-inference capabilities that use NLP to perform online ML inference.
  • Architect, implement, and optimize automated data pipelines, ETL processes, model workflows, and data warehousing solutions.
  • Design, document, and advocate best practices for data architecture, AI modeling, and data integration to accelerate adoption of Purgo AI solutions.
  • Conduct technical assessments, develop proofs of value, and present automated solutions that drive optimized paths of product adoption within customer organizations.
  • Continue to build deep subject-matter expertise in the latest industry trends and advancements in cloud data platforms, code-generation LLMs, text-to-query, and related technologies.
  • Architect and implement security, compliance, and data governance standards in all Purgo AI solutions.

About you


  • Graduate education in computer science or information technology.
  • Minimum of 5 years of experience in a Data Architect or Solutions Architect role, with a focus on implementing cloud data warehouse (e.g., Snowflake, Databricks) solutions.
  • Enterprise customer-facing experience driving large-scale cloud data warehouse implementations/migrations is strongly preferred.
  • Strong expertise in ML training and inference across algorithms such as classification, regression, and neural networks.
  • Strong expertise in NLP and in text-based interfaces to ML models.
  • Strong expertise in data warehousing, ETL processes, AI modeling, and data integration techniques.
  • Proficiency in programming languages such as SQL, Python, and Scala.
  • Experience with cloud platforms such as AWS, Azure, or Google Cloud.
  • Deep understanding of data modeling, AI algorithms (e.g., forecasting, anomaly detection), data governance, and data security principles.
  • Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment.
  • Strong communication and presentation skills, with the ability to effectively convey technical concepts to non-technical stakeholders.
  • Relevant certifications in Snowflake, Databricks, AI, or cloud platforms are a plus.


About the Company


Purgo AI is an application design studio powered by generative AI, which interprets high-level business problem statements and generates the design, source code, and deployment of data applications over cloud data warehouses. The product’s fine-tuned LLMs specialize in solving business problems with business intelligence (ETL/ELT), cloud migration, and on-demand machine learning and inferencing for forecasting, anomaly detection, and pattern recognition. Purgo AI integrates out of the box with all leading cloud data warehouses, including Databricks, Snowflake, Microsoft Fabric, Google BigQuery, and Amazon Redshift.

• Business Intelligence (ETL/ELT): Business analysts feed new ETL/ELT user requirements through Purgo AI’s Jira app, which triggers the generation of behavior-driven design (BDD) requirements for solving the Jira tickets, along with test harnesses for quality assurance. Purgo AI generates source code from integrated code-generation LLMs using the BDD specifications, without needing any human prompting. The generated code is subject to the pre-generated quality assurance tests, and test failures re-trigger generation of the source code. The final source code is ready for deployment over the cloud data warehouse after inspection and approval by the business analyst team. The entire process has end-to-end traceability through Purgo AI-generated log entries across the Jira and GitHub systems.

• Cloud Migration: Transformation teams interpret legacy data applications through Purgo AI’s GitHub plug-in, which triggers the generation of a cloud migration plan for stored procedures, data schemas, and data records for a specific target cloud data warehouse. The migration plan includes behavior-driven designs for each of the components, which are used to generate source code from integrated code-generation LLMs without needing any human prompting. Purgo AI leverages the generated migration plan to execute a methodical migration, all the way through data migration. Purgo AI tests the migration with pre-generated tests that validate data, table relationships, procedures, and data schemas. With this end-to-end solution, the product addresses a trillion-dollar legacy application market.

• On-demand Machine Learning and Inferencing: Purgo AI interprets predictive business problem statements and generates the design for training an on-demand machine learning (ML) model, along with the necessary training and test datasets. Purgo AI integrates with ML frameworks such as MLflow and TensorFlow to train, test, and deploy new ML models custom-designed from the problem statement, in the form of notebooks deployed over the cloud data warehouse. Purgo AI then queries the deployed model(s) using the parameters described in the problem statement to generate a predicted solution for the business user. The product delivers this on-demand machine learning and inferencing for business problems across forecasting, anomaly detection, and pattern recognition use cases.

The product integrates seamlessly with Jira, GitHub (to interpret existing/legacy source code), code LLMs (from GitHub, AWS, Mistral, and Meta), and test automation platforms (such as Selenium and Pytest). The company’s co-founder and CTO, Sang Kim, has been an engineering leader at VMware and BlackBerry. The company is based in Palo Alto, CA and was co-created by The Hive, a venture studio focused on data and AI in the enterprise.

To apply, please email jobs@purgo.ai.

Purgo AI is an affirmative action employer and welcomes candidates who will contribute to the diversity of the company.