1,514 Pyspark Developer jobs in India

Python Pyspark

Chennai, Tamil Nadu Virtusa

Job Description

Overall eight-plus years of experience, with a minimum of five-plus years of experience in the required technical skills

Required technical skills: Python, PySpark, SQL

AWS Cloud experience and Healthcare domain knowledge are desired

**About Virtusa**

Teamwork, quality of life, professional and personal development: values that Virtusa is proud to embody. When you join us, you join a team of 36,000 people globally that cares about your growth, one that seeks to provide you with exciting projects, opportunities, and work with state-of-the-art technologies throughout your career with us.

Great minds, great potential: it all comes together at Virtusa. We value collaboration and the team environment of our company, and seek to provide great minds with a dynamic place to nurture new ideas and foster excellence.

Virtusa was founded on principles of equal opportunity for all, and so does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other basis covered by appropriate law. All employment is decided on the basis of qualifications, merit, and business need.

PySpark Developer

Noida, Uttar Pradesh Anicalls (Pty) Ltd

Posted today

Job Description

• PySpark Developer / PySpark Data Engineer
• Should have at least three years of strong experience in PySpark
• Strong SQL Expertise
• Strong communication and client-facing skills

PySpark Developer

Mumbai, Maharashtra Anicalls (Pty) Ltd

Posted today

Job Description

• PySpark Developer / PySpark Data Engineer
• Should have at least three years of strong experience in PySpark
• Strong SQL Expertise
• Strong communication and client-facing skills

Pyspark Developer

Bengaluru, Karnataka Tiger Analytics

Posted today

Job Description

Tiger Analytics is a global analytics consulting firm. With data and technology at the core of our solutions, we are solving some of the toughest problems out there. Our culture is modeled around expertise and mutual respect with a team-first mindset. Working at Tiger, you’ll be at the heart of this AI revolution. You’ll work with teams that push the boundaries of what is possible and build solutions that energize and inspire. We are headquartered in Silicon Valley and have our delivery centers across the globe. The below role is for our Chennai.

About the role:

You will work on teams building a variety of big data analytics solutions including big data lakes. More specifically, you will work on:

• Scalable data ingestion pipelines to handle real time streams, CDC events, and batch data
• High-performance data processing for structured and unstructured data, and data harmonization
• Scheduling, orchestrating, and validating pipelines
• Exception handling and log monitoring for debugging
• Collaborate with business consultants, data scientists, engineers, and application developers to develop analytics solutions

Job Requirement

Required Experience, Skills & Competencies:

• Hadoop ecosystem - HDFS, Hive, Sqoop, Kafka, ELK Stack etc.
• Spark, Scala, Python and core/advanced Java
• NoSQL databases e.g. HBase, Cassandra, MongoDB
• Relevant AWS or Azure components required to build big data solutions
• Good to know: Databricks, Snowflake
• Ability to develop and manage scalable Hadoop cluster environments
• Good understanding of data warehousing concepts, distributed systems, data pipelines, ETL
• 3+ years of professional experience with at least 2 years in big data engineering

Designation will be commensurate with expertise/experience. Compensation packages are among the best in the industry.

PySpark Developer

Hyderabad, Telangana Anicalls (Pty) Ltd

Posted today

Job Description

• PySpark Developer / PySpark Data Engineer
• Should have at least three years of strong experience in PySpark
• Strong SQL Expertise
• Strong communication and client-facing skills

Officer-Pyspark Developer

Pune, Maharashtra Citigroup

Posted today

Job Description

At **Citi** we're not just building technology, we're building the future of banking. Encompassing a broad range of specialties, roles, and cultures, our teams are creating innovations used across the globe. Citi is constantly growing and progressing through our technology, with a laser focus on evolving the ways of doing things. As one of the world's most global banks, we're changing how the world does business.
**Shape your Career with Citi**
We're currently looking for a high caliber professional to join our team as **Officer, Pyspark- Developer - Hybrid** (Internal Job Title: **25879970** Pyspark Developer- C11) based in Chennai, India. Being part of our team means that we'll provide you with the resources to meet your unique needs, empower you to make healthy decisions and manage your financial well-being to help plan for your future. For instance:
+ We provide programs and services for your physical and mental well-being including access to telehealth options, health advocates, confidential counseling and more. Coverage varies by country.
+ We empower our employees to manage their financial well-being and help them plan for the future.
+ We provide access to an array of learning and development resources to help broaden and deepen your skills and knowledge as your career progresses.
**In this role, you're expected to:**
+ Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements, including using script tools and analyzing/interpreting code.
+ Consult with users, clients, and other technology groups on issues, and recommend programming solutions, install, and support customer exposure systems
+ Apply fundamental knowledge of programming languages for design specifications.
+ Analyze applications to identify vulnerabilities and security issues, as well as conduct testing and debugging
+ Serve as advisor or coach to new or lower level analysts
**As a successful candidate, you'd ideally have the following skills and exposure:**
+ **4-8 years relevant experience in Data Analytics OR Big Data**
+ **Hands on experience over SQL, Python, Pyspark, with Spark components**
+ **2-4 years of experience as Big Data Engineer to develop, optimize, and manage large-scale data processing systems and analytics platforms.**
+ **4 years of experience in distributed data processing & near real-time data analytics using PySpark.**
+ **ETL Experience preferred over Abinitio.**
+ **Strong understanding of PySpark execution plans, partitioning & optimization techniques.**
Working at Citi is far more than just a job. A career with us means joining a family of more than 230,000 dedicated people from around the globe. At Citi, you'll have the opportunity to grow your career, give back to your community and make a real impact.
**Take the next step in your career, apply for this role at Citi today**
**Job Family Group:**
Technology
---
**Job Family:**
Applications Development
---
**Time Type:**
Full time
---
**Most Relevant Skills**
Please see the requirements listed above.
---
**Other Relevant Skills**
PySpark.
---
_Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law._
_If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi._
_View Citi's EEO Policy Statement and the Know Your Rights poster._
Citi is an equal opportunity and affirmative action employer.
Minority/Female/Veteran/Individuals with Disabilities/Sexual Orientation/Gender Identity.

Officer-Pyspark Developer

Chennai, Tamil Nadu 12542 Citicorp Services India Private Limited

Posted today

Job Description

At Citi we’re not just building technology, we’re building the future of banking. Encompassing a broad range of specialties, roles, and cultures, our teams are creating innovations used across the globe. Citi is constantly growing and progressing through our technology, with a laser focus on evolving the ways of doing things. As one of the world’s most global banks, we’re changing how the world does business.

Shape your Career with Citi

We’re currently looking for a high caliber professional to join our team as Officer, Pyspark- Developer - Hybrid (Internal Job Title: 25879970 Pyspark Developer- C11) based in Chennai, India. Being part of our team means that we’ll provide you with the resources to meet your unique needs, empower you to make healthy decisions and manage your financial well-being to help plan for your future. For instance:

  • We provide programs and services for your physical and mental well-being including access to telehealth options, health advocates, confidential counseling and more. Coverage varies by country.
  • We empower our employees to manage their financial well-being and help them plan for the future.
  • We provide access to an array of learning and development resources to help broaden and deepen your skills and knowledge as your career progresses.
In this role, you’re expected to:

  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements, including using script tools and analyzing/interpreting code.
  • Consult with users, clients, and other technology groups on issues, and recommend programming solutions, install, and support customer exposure systems
  • Apply fundamental knowledge of programming languages for design specifications.
  • Analyze applications to identify vulnerabilities and security issues, as well as conduct testing and debugging
  • Serve as advisor or coach to new or lower level analysts
As a successful candidate, you’d ideally have the following skills and exposure:

  • 4-8 years relevant experience in Data Analytics OR Big Data
  • Hands on experience over SQL, Python, Pyspark, with Spark components
  • 2-4 years of experience as Big Data Engineer to develop, optimize, and manage large-scale data processing systems and analytics platforms.
  • 4 years of experience in distributed data processing & near real-time data analytics using PySpark.
  • ETL Experience preferred over Abinitio.
  • Strong understanding of PySpark execution plans, partitioning & optimization techniques.

Working at Citi is far more than just a job. A career with us means joining a family of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

    Take the next step in your career, apply for this role at Citi today

    ---

    Job Family Group:

    Technology

    ---

    Job Family:

    Applications Development

    ---

    Time Type:

    Full time

    ---

    Most Relevant Skills

    Please see the requirements listed above.

    ---

    Other Relevant Skills

    PySpark.

    ---

    Lead Pyspark Developer

    Bellandur, Karnataka Synechron

    Posted today

    Job Description

    Overall Responsibilities:

  • Data Pipeline Development:  Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy.
  • Data Ingestion:  Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
  • Data Transformation and Processing:  Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
  • Performance Optimization:  Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes.
  • Data Quality and Validation:  Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline.
  • Automation and Orchestration:  Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.
  • Monitoring and Maintenance:  Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes.
  • Collaboration:  Work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives.
  • Documentation:  Maintain thorough documentation of data engineering processes, code, and pipeline configurations.
    Software Requirements:

  • Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
  • Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
  • Familiarity with Hadoop, Kafka, and other distributed computing tools.
  • Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
  • Strong scripting skills in Linux.
    Category-wise Technical Skills:

  • PySpark:  Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
  • Cloudera Data Platform:  Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Data Warehousing:  Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
  • Big Data Technologies:  Familiarity with Hadoop, Kafka, and other distributed computing tools.
  • Orchestration and Scheduling:  Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
  • Scripting and Automation:  Strong scripting skills in Linux.
    Experience:

  • 5-12 years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.
  • Proven track record of implementing data engineering best practices.
  • Experience in data ingestion, transformation, and optimization on the Cloudera Data Platform.
    Day-to-Day Activities:

  • Design, develop, and maintain ETL pipelines using PySpark on CDP.
  • Implement and manage data ingestion processes from various sources.
  • Process, cleanse, and transform large datasets using PySpark.
  • Conduct performance tuning and optimization of ETL processes.
  • Implement data quality checks and validation routines.
  • Automate data workflows using orchestration tools.
  • Monitor pipeline performance and troubleshoot issues.
  • Collaborate with team members to understand data requirements.
  • Maintain documentation of data engineering processes and configurations.
    Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
  • Relevant certifications in PySpark and Cloudera technologies are a plus.
    Soft Skills:

  • Strong analytical and problem-solving skills.
  • Excellent verbal and written communication abilities.
  • Ability to work independently and collaboratively in a team environment.
  • Attention to detail and commitment to data quality.
    Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and is an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


    All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

    Lead Pyspark Developer

    Chennai, Tamil Nadu Synechron

    Posted today

    Job Description

    Overall Responsibilities:

  • Data Pipeline Development:  Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy.
  • Data Ingestion:  Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
  • Data Transformation and Processing:  Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
  • Performance Optimization:  Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes.
  • Data Quality and Validation:  Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline.
  • Automation and Orchestration:  Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.
  • Monitoring and Maintenance:  Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes.
  • Collaboration:  Work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives.
  • Documentation:  Maintain thorough documentation of data engineering processes, code, and pipeline configurations.
    Software Requirements:

  • Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
  • Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
  • Familiarity with Hadoop, Kafka, and other distributed computing tools.
  • Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
  • Strong scripting skills in Linux.
    Category-wise Technical Skills:

  • PySpark:  Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
  • Cloudera Data Platform:  Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
  • Data Warehousing:  Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
  • Big Data Technologies:  Familiarity with Hadoop, Kafka, and other distributed computing tools.
  • Orchestration and Scheduling:  Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
  • Scripting and Automation:  Strong scripting skills in Linux.
    Experience:

  • 5-12 years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.
  • Proven track record of implementing data engineering best practices.
  • Experience in data ingestion, transformation, and optimization on the Cloudera Data Platform.
    Day-to-Day Activities:

  • Design, develop, and maintain ETL pipelines using PySpark on CDP.
  • Implement and manage data ingestion processes from various sources.
  • Process, cleanse, and transform large datasets using PySpark.
  • Conduct performance tuning and optimization of ETL processes.
  • Implement data quality checks and validation routines.
  • Automate data workflows using orchestration tools.
  • Monitor pipeline performance and troubleshoot issues.
  • Collaborate with team members to understand data requirements.
  • Maintain documentation of data engineering processes and configurations.
    Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
  • Relevant certifications in PySpark and Cloudera technologies are a plus.
    Soft Skills:

  • Strong analytical and problem-solving skills.
  • Excellent verbal and written communication abilities.
  • Ability to work independently and collaboratively in a team environment.
  • Attention to detail and commitment to data quality.
    Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and is an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


    All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

    Officer-Pyspark Developer

    Pune, Maharashtra 12542 Citicorp Services India Private Limited

    Posted today

    Job Description

    At Citi we’re not just building technology, we’re building the future of banking. Encompassing a broad range of specialties, roles, and cultures, our teams are creating innovations used across the globe. Citi is constantly growing and progressing through our technology, with a laser focus on evolving the ways of doing things. As one of the world’s most global banks, we’re changing how the world does business.

    Shape your Career with Citi

    We’re currently looking for a high caliber professional to join our team as Officer, Pyspark- Developer - Hybrid (Internal Job Title: 25879970 Pyspark Developer- C11) based in Chennai, India. Being part of our team means that we’ll provide you with the resources to meet your unique needs, empower you to make healthy decisions and manage your financial well-being to help plan for your future. For instance:

  • We provide programs and services for your physical and mental well-being including access to telehealth options, health advocates, confidential counseling and more. Coverage varies by country.
  • We empower our employees to manage their financial well-being and help them plan for the future.
  • We provide access to an array of learning and development resources to help broaden and deepen your skills and knowledge as your career progresses.
    In this role, you’re expected to:

  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements, including using script tools and analyzing/interpreting code.
  • Consult with users, clients, and other technology groups on issues, and recommend programming solutions, install, and support customer exposure systems
  • Apply fundamental knowledge of programming languages for design specifications.
  • Analyze applications to identify vulnerabilities and security issues, as well as conduct testing and debugging
  • Serve as advisor or coach to new or lower level analysts
    As a successful candidate, you’d ideally have the following skills and exposure:

  • 4-8 years relevant experience in Data Analytics OR Big Data
  • Hands on experience over SQL, Python, Pyspark, with Spark components
  • 2-4 years of experience as Big Data Engineer to develop, optimize, and manage large-scale data processing systems and analytics platforms.
  • 4 years of experience in distributed data processing & near real-time data analytics using PySpark.
  • ETL Experience preferred over Abinitio.
  • Strong understanding of PySpark execution plans, partitioning & optimization techniques.

    Working at Citi is far more than just a job. A career with us means joining a family of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

    Take the next step in your career, apply for this role at Citi today

    ---

    Job Family Group:

    Technology

    ---

    Job Family:

    Applications Development

    ---

    Time Type:

    Full time

    ---

    Most Relevant Skills

    Please see the requirements listed above.

    ---

    Other Relevant Skills

    PySpark.

    ---
