    Robotics with Python: Q-Learning vs Actor-Critic vs Evolutionary Algorithms

    By ProfitlyAI | November 13, 2025


    There are 4 types of Machine Learning:

    • Supervised — when all of the observations in the dataset are labeled with a target variable, and you can perform regression/classification to learn how to predict them.
    • Unsupervised — when there is no target variable, so you can perform clustering to segment and group the data.
    • Semi-Supervised — when the target variable is not complete, so the model has to learn how to predict unlabeled data as well. In this case, a combination of supervised and unsupervised models is used.
    • Reinforcement — when there is a reward instead of a target variable and you don't know what the best solution is, so it's more a process of trial and error to reach a specific goal.

    More precisely, Reinforcement Learning studies how an AI takes action in an interactive environment in order to maximize the reward. During supervised training, you already know the correct answer (the target variable), and you fit a model to replicate it. On the contrary, in an RL problem you don't know a priori what the correct answer is; the only way to find out is to take action and get feedback (the reward), so the model learns by exploring and making mistakes.

    RL is widely used for training robots. A good example is the autonomous vacuum: when it passes over a dusty part of the floor, it receives a reward (+1), but it gets punished (-1) when it bumps into a wall. So the robot learns which actions are right and which to avoid.

    In this article, I'm going to show how to build custom 3D environments for training a robot using different Reinforcement Learning algorithms. I'll present some useful Python code that can be easily applied to other similar cases (just copy, paste, run) and walk through every line of code with comments so that you can replicate this example.

    Setup

    While a supervised use case requires a target variable and a training set, an RL problem needs:

    • Environment — the surroundings of the agent; it assigns rewards for actions and provides the new state resulting from the decision made. Basically, it's the space the AI can interact with (in the autonomous vacuum example, the room to clean).
    • Action — the set of moves the AI can make in the environment. The action space can be "discrete" (a fixed number of moves, like in the game of chess) or "continuous" (infinite possible states, like driving a car or trading).
    • Reward — the consequence of the action (+1/-1).
    • Agent — the AI learning the best course of action in the environment to maximize the reward.

    Regarding the environment, the most used 3D physics simulators are PyBullet (beginners), Webots (intermediate), MuJoCo (advanced), and Gazebo (professionals). You can use any of them as standalone software or through Gym, a library made by OpenAI for developing Reinforcement Learning algorithms, built on top of different physics engines.

    I'll use Gymnasium (pip install gymnasium) to load one of the default environments made with MuJoCo (Multi-Joint dynamics with Contact, pip install mujoco).

    import gymnasium as gym
    
    env = gym.make("Ant-v4")
    obs, info = env.reset()
    
    print(f"--- INFO: {len(info)} ---")
    print(info, "\n")
    
    print(f"--- OBS: {obs.shape} ---")
    print(obs, "\n")
    
    print(f"--- ACTIONS: {env.action_space} ---")
    print(env.action_space.sample(), "\n")
    
    print(f"--- REWARD ---")
    obs, reward, terminated, truncated, info = env.step( env.action_space.sample() )
    print(reward, "\n")

    The robot Ant is a 3D quadruped agent consisting of a torso and 4 legs attached to it. Each leg has two body parts, so in total it has 8 joints (flexible body parts) and 9 links (solid body parts). The goal of this environment is to apply force (push/pull) and torque (twist/turn) to move the robot in a certain direction.

    Let's try the environment by running one single episode with the robot doing random actions (an episode is a complete run of the agent interacting with the environment, from start to termination).

    import time
    
    env = gym.make("Ant-v4", render_mode="human")
    obs, info = env.reset()
    
    reset = False #reset if the episode ends
    episode = 1
    total_reward, step = 0, 0
    
    for _ in range(240):
        ## action
        step += 1
        action = env.action_space.sample() #random action
        obs, reward, terminated, truncated, info = env.step(action)
        ## reward
        total_reward += reward
        ## render
        env.render() #render physics step (CPU speed = 0.1 seconds)
        time.sleep(1/240) #slow down to real-time (240 steps × 1/240 second sleep = 1 second)
        if (step == 1) or (step % 100 == 0): #print first step and every 100 steps
            print(f"EPISODE {episode} - Step:{step}, Reward:{reward:.1f}, Total:{total_reward:.1f}")
        ## reset
        if reset:
            if terminated or truncated: #print the last step
                print(f"EPISODE {episode} - Step:{step}, Reward:{reward:.1f}, Total:{total_reward:.1f}")
                obs, info = env.reset()
                episode += 1
                total_reward, step = 0, 0
                print("------------------------------------------")
    
    env.close()

    Custom Environment

    Usually, environments share the same basic properties (a minimal skeleton illustrating them is sketched after the list):

    1. Reset — to restart to an initial state or to a random point within the data.
    2. Render — to visualize what's happening.
    3. Step — to execute the action chosen by the agent and change state.
    4. Calculate Reward — to give the appropriate reward/penalty after an action.
    5. Get Info — to collect information about the game after an action.
    6. Terminated or Truncated — to decide whether the episode is over after an action (fail or success).
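
    As a reference, here is a minimal sketch of a custom Gymnasium environment exposing these methods (the spaces, dynamics, and reward below are placeholders for illustration, not the Ant's):

    import gymnasium as gym
    import numpy as np
    
    class MinimalEnv(gym.Env):
        def __init__(self):
            super().__init__()
            self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(4,), dtype=np.float32)
            self.action_space = gym.spaces.Discrete(2)
    
        def reset(self, seed=None, options=None):
            super().reset(seed=seed) #seeds self.np_random
            self.state = np.zeros(4, dtype=np.float32)
            return self.state, {} #obs, info
    
        def step(self, action):
            self.state = self.state + self.np_random.normal(size=4).astype(np.float32) #toy dynamics
            reward = 1.0 if action == 1 else 0.0 #placeholder reward
            terminated = bool(np.abs(self.state).max() > 10) #placeholder fail condition
            truncated = False
            info = {"max_abs_state": float(np.abs(self.state).max())}
            return self.state, reward, terminated, truncated, info
    
        def render(self):
            pass #no visualization in this sketch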

    Having default environments available in Gym is convenient, but it's not always what you need. Sometimes you have to build a custom environment that meets your project requirements. This is the most delicate step of a Reinforcement Learning use case: the quality of the model strongly depends on how well the environment is designed.

    There are several ways to make your own environment:

    • Create from scratch: you design everything (i.e. the physics, the body, the surroundings). You have total control, but it's the most complicated way since you start with an empty world.
    • Modify the existing XML file: every simulated agent is defined by an XML file. You can edit the physical properties (i.e. make the robot taller or heavier), but the logic stays the same.
    • Modify the existing Python class: keep the agent and the physics as they are, but change the rules of the game (i.e. new rewards and termination rules). One can even turn a continuous env into a discrete action space.

    I'm going to customize the default Ant environment to make the robot jump. I shall change both the physical properties in the XML file and the reward function of the Python class. Basically, I just need to give the robot stronger legs and a reward for jumping.

    First of all, let's locate the XML file, make a copy, and edit it.

    import os
    
    print(os.path.join(os.path.dirname(gym.__file__), "envs/mujoco/assets/ant.xml"))

    Since my goal is to have a more "jumpy" Ant, I can reduce the density of the body to make it lighter, and add force to the legs so it can jump higher (the gravity in the simulator stays the same).
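
    As a rough illustration, the XML edits touch the default geom density and the actuators' gear. The fragment below uses hypothetical values, not necessarily the exact ones in the edited file:

    <!-- hypothetical fragment of custom_ant.xml (illustrative values) -->
    <!-- 1) lower the default geom density so the body is lighter -->
    <default>
      <geom condim="3" density="3.0" friction="1 0.5 0.5" margin="0.01"/>
    </default>
    
    <!-- 2) raise the actuator gear so each leg joint can exert more force -->
    <actuator>
      <motor ctrllimited="true" ctrlrange="-1.0 1.0" joint="hip_1"   gear="300"/>
      <motor ctrllimited="true" ctrlrange="-1.0 1.0" joint="ankle_1" gear="300"/>
      <!-- repeat for the remaining hip/ankle joints -->
    </actuator>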

    You can find the full edited XML file on my GitHub.

    Then, I want to modify the reward function of the Gym environment. To create a custom env, you have to build a new class that overrides the original one where needed (in my case, how the reward is calculated). After the new env is registered, it can be used like any other Gym env.

    from gymnasium.envs.mujoco.ant_v4 import AntEnv
    from gymnasium.envs.registration import register
    import numpy as np
    
    ## modify the class
    class CustomAntEnv(AntEnv):
        def __init__(self, **kwargs):
            super().__init__(xml_file=os.getcwd()+"/assets/custom_ant.xml", **kwargs) #specify xml_file only if modified
    
        def CUSTOM_REWARD(self, action, info):
            torso_height = float(self.data.qpos[2]) #torso z-coordinate = how high it is
            reward = np.clip(a=torso_height-0.6, a_min=0, a_max=1) *10 #when the torso is high
            terminated = bool(torso_height < 0.2) #if torso close to the ground
            info["torso_height"] = torso_height #add info for logging
            return reward, terminated, info
    
        def step(self, action):
            obs, reward, terminated, truncated, info = super().step(action) #override original step()
            new_reward, new_terminated, new_info = self.CUSTOM_REWARD(action, info)
            return obs, new_reward, new_terminated, truncated, new_info #must return the same things
    
        def reset_model(self):
            return super().reset_model() #keeping the reset as it is
    
    ## register the new env
    register(id="CustomAntEnv-v1", entry_point="__main__:CustomAntEnv")
    
    ## test
    env = gym.make("CustomAntEnv-v1", render_mode="human")
    obs, info = env.reset()
    for _ in range(1000):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            obs, info = env.reset()
    env.close()

    If the 3D world and its rules are well designed, you just need a good RL model, and the robot will do anything to maximize the reward. There are two families of models that dominate the RL scene: Q-Learning models (best for discrete action spaces) and Actor-Critic models (best for continuous action spaces). Besides these, there are some newer and more experimental approaches emerging, like Evolutionary algorithms and Imitation learning.

    Q-Learning

    Q-Learning is the most basic form of Reinforcement Learning and uses Q-values (the "Q" stands for "quality") to represent how useful an action is for gaining some future reward. To put it in simple terms, if at the end of the game the agent gets a certain reward after a set of actions, the initial Q-value is the discounted future reward.

    As the agent explores and receives feedback, it updates the Q-values stored in the Q-matrix (Bellman equation). The goal of the agent is to learn the optimal Q-value for each state/action pair, so that it can make the best decisions and maximize the expected future reward of a specific action in a specific state.

    During the learning process, the agent uses an exploration-exploitation trade-off. Initially, it explores the environment by taking random actions, allowing it to gather experience (information about the rewards associated with different actions and states). As it learns and the level of exploration decays, it starts exploiting its knowledge by selecting the action with the highest Q-value for each state.
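
    For concreteness, here is a minimal sketch of the tabular Q-Learning update with an epsilon-greedy policy (a generic illustration of the idea, not code used later in this tutorial):

    import numpy as np
    
    n_states, n_actions = 10, 5
    Q = np.zeros((n_states, n_actions)) #Q-matrix: one row per state, one column per action
    alpha, gamma, eps = 0.1, 0.99, 1.0  #learning rate, discount factor, exploration rate
    
    def q_update(state, action, reward, next_state):
        #Bellman update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        target = reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (target - Q[state, action])
    
    def choose_action(state):
        #epsilon-greedy: explore with probability eps, otherwise exploit the best known action
        if np.random.rand() < eps:
            return np.random.randint(n_actions)
        return int(Q[state].argmax())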

    Please note that the Q-matrix can be multidimensional and much more complicated; think, for instance, of a trading algorithm.

    In 2013, there was a breakthrough in the field of Reinforcement Learning when DeepMind introduced the Deep Q-Network (DQN), designed to learn to play Atari games from raw pixels, combining the two concepts of Deep Learning and Q-Learning. To put it in simple terms, Deep Learning is used to approximate the Q-values instead of explicitly storing them in a table. This is done with a Neural Network trained to predict the Q-values of each possible action, using the current state of the environment as input.
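
    A minimal PyTorch sketch of such a network (an illustration of the idea, not the exact architecture Stable-Baselines3 builds internally): it maps a state vector to one Q-value per discrete action.

    import torch
    import torch.nn as nn
    
    class QNetwork(nn.Module):
        def __init__(self, obs_dim, n_actions):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, n_actions)) #one Q-value per discrete action
    
        def forward(self, obs):
            return self.net(obs)
    
    q_net = QNetwork(obs_dim=27, n_actions=5) #hypothetical sizes (27-dim observation, 5 discrete actions)
    q_values = q_net(torch.randn(1, 27))      #Q-values for a dummy state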

    The Q-Learning family was mainly designed for discrete environments, so it doesn't really work on the robot Ant. An alternative solution is to discretize the environment (even if it's not the most efficient way to approach a continuous problem). We just need to create a wrapper for the Python class that expects a discrete action (i.e. "move forward") and consequently applies force to the joints based on that command.

    class DiscreteEnvWrapper(gym.Env):
    
        def __init__(self, render_mode=None):
            super().__init__()
            self.env = gym.make("CustomAntEnv-v1", render_mode=render_mode)
            self.action_space = gym.spaces.Discrete(5)  #can have 5 actions
            self.observation_space = self.env.observation_space #same observation space
            n_joints = self.env.action_space.shape[0]
            self.action_map = [
                ## action 0 = stand still
                np.zeros(n_joints),
                ## action 1 = push all forward
                0.5*np.ones(n_joints),
                ## action 2 = push all backward
               -0.5*np.ones(n_joints),
                ## action 3 = front legs forward + back legs backward
                0.5*np.concatenate([np.ones(n_joints//2), -np.ones(n_joints//2)]),
                ## action 4 = front legs backward + back legs forward
                0.5*np.concatenate([-np.ones(n_joints//2), np.ones(n_joints//2)])
            ]
    
        def step(self, discrete_action):
            assert self.action_space.contains(discrete_action)
            continuous_action = self.action_map[discrete_action]
            obs, reward, terminated, truncated, info = self.env.step(continuous_action)
            return obs, reward, terminated, truncated, info
    
        def reset(self, **kwargs):
            obs, info = self.env.reset(**kwargs)
            return obs, info
    
        def render(self):
            return self.env.render()
    
        def close(self):
            self.env.close()
    
    ## test
    env = DiscreteEnvWrapper()
    obs, info = env.reset()
    
    print(f"--- INFO: {len(info)} ---")
    print(info, "\n")
    
    print(f"--- OBS: {obs.shape} ---")
    print(obs, "\n")
    
    print(f"--- ACTIONS: {env.action_space} ---")
    discrete_action = env.action_space.sample()
    continuous_action = env.action_map[discrete_action]
    print("discrete:", discrete_action, "-> continuous:", continuous_action, "\n")
    
    print(f"--- REWARD ---")
    obs, reward, terminated, truncated, info = env.step( discrete_action )
    print(reward, "\n")

    Now this environment, with just 5 possible actions, will definitely work with DQN. In Python, the easiest way to use Deep RL algorithms is through Stable-Baselines3 (pip install stable-baselines3), a collection of the most well-known models, already pre-implemented and ready to go, all written in PyTorch (pip install torch). Additionally, I find it very useful to follow the training progress on TensorBoard (pip install tensorboard). I created a folder named "logs", and I can simply run tensorboard --logdir=logs/ in the terminal to serve the dashboard locally (http://localhost:6006/).

    import stable_baselines3 as sb
    from stable_baselines3.common.vec_env import DummyVecEnv
    
    # TRAIN
    env = DiscreteEnvWrapper(render_mode=None) #no rendering to speed up
    env = DummyVecEnv([lambda:env])
    model_name = "ant_dqn"
    
    print("Training START")
    model = sb.DQN(policy="MlpPolicy", env=env, verbose=0, learning_rate=0.005,
                   exploration_fraction=0.2, exploration_final_eps=0.05, #eps decays linearly from 1 to 0.05
                   tensorboard_log="logs/") #>tensorboard --logdir=logs/
    model.learn(total_timesteps=1_000_000, #20min
                tb_log_name=model_name, log_interval=10)
    print("Training DONE")
    
    model.save(model_name)

    After the training is complete, we can load the new model and test it in the rendered environment. Now the agent won't be updating its preferred actions anymore. Instead, it will use the trained model to predict the next best action given the current state.

    # TEST
    env = DiscreteEnvWrapper(render_mode="human")
    model = sb.DQN.load(path=model_name, env=env)
    obs, info = env.reset()
    
    reset = False #reset if the episode ends
    episode = 1
    total_reward, step = 0, 0
    
    for _ in range(1000):
        ## action
        step += 1
        action, _ = model.predict(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        ## reward
        total_reward += reward
        ## render
        env.render()
        time.sleep(1/240)
        if (step == 1) or (step % 100 == 0): #print first step and every 100 steps
            print(f"EPISODE {episode} - Step:{step}, Reward:{reward:.1f}, Total:{total_reward:.1f}")
        ## reset
        if reset:
            if terminated or truncated: #print the last step
                print(f"EPISODE {episode} - Step:{step}, Reward:{reward:.1f}, Total:{total_reward:.1f}")
                obs, info = env.reset()
                episode += 1
                total_reward, step = 0, 0
                print("------------------------------------------")
    
    env.close()

    As you can see, the robot learned that the best policy is to jump, but the movements aren't fluid because we didn't use a model designed for continuous actions.

    Actor-Critic

    In practice, Actor-Critic algorithms are the most used, as they are well suited for continuous environments. The basic idea is to have two systems working together: a policy function ("Actor") for selecting actions, and a value function ("Critic") to estimate the expected reward. The model learns how to adjust its decision making by comparing the actual rewards it receives with the Critic's predictions.

    One of the first deep Actor-Critic algorithms for continuous control is Advantage Actor-Critic (A2C), a synchronous variant of DeepMind's A3C (2016) that was popularized by OpenAI. It aims to minimize the loss between the actual reward received after the Actor takes an action and the reward estimated by the Critic. The Neural Network shares its input layers between the Actor and the Critic, but it returns two separate outputs: the action to take (the Actor's policy, playing the role that Q-values play in DQN) and the predicted value of the state (the Critic's estimate, which is the addition of A2C).
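
    A minimal PyTorch sketch of this shared-backbone, two-head architecture (again just an illustration of the idea, not the network Stable-Baselines3 builds):

    import torch
    import torch.nn as nn
    
    class ActorCritic(nn.Module):
        def __init__(self, obs_dim, act_dim):
            super().__init__()
            self.backbone = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU()) #shared input layers
            self.actor = nn.Linear(64, act_dim)               #head 1: mean of the action distribution (policy)
            self.log_std = nn.Parameter(torch.zeros(act_dim)) #learnable exploration noise
            self.critic = nn.Linear(64, 1)                    #head 2: estimated value of the state
    
        def forward(self, obs):
            h = self.backbone(obs)
            dist = torch.distributions.Normal(self.actor(h), self.log_std.exp()) #Actor: policy
            value = self.critic(h)                                               #Critic: expected reward
            return dist, value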

    Over the years, Actor-Critic algorithms have improved with more stable and efficient variants, like Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). The latter uses not one but two Critic networks to get a "second opinion". Keep in mind that we can use these models directly in the continuous environment.

    # TRAIN
    env_name, model_name = "CustomAntEnv-v1", "ant_sac"
    env = gym.make(env_name) #no rendering to speed up
    env = DummyVecEnv([lambda:env])
    
    print("Training START")
    model = sb.SAC(policy="MlpPolicy", env=env, verbose=0, learning_rate=0.005,
                   ent_coef=0.005, #exploration
                   tensorboard_log="logs/") #>tensorboard --logdir=logs/
    model.learn(total_timesteps=100_000, #3h
                tb_log_name=model_name, log_interval=10)
    print("Training DONE")
    
    ## save
    model.save(model_name)

    Training the SAC requires more time, but the results are much better.

    # TEST
    env = gym.make(env_name, render_mode="human")
    model = sb.SAC.load(path=model_name, env=env)
    obs, info = env.reset()
    
    reset = False #reset if the episode ends
    episode = 1
    total_reward, step = 0, 0
    
    for _ in range(1000):
        ## action
        step += 1
        action, _ = model.predict(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        ## reward
        total_reward += reward
        ## render
        env.render()
        time.sleep(1/240)
        if (step == 1) or (step % 100 == 0): #print first step and every 100 steps
            print(f"EPISODE {episode} - Step:{step}, Reward:{reward:.1f}, Total:{total_reward:.1f}")
        ## reset
        if reset:
            if terminated or truncated: #print the last step
                print(f"EPISODE {episode} - Step:{step}, Reward:{reward:.1f}, Total:{total_reward:.1f}")
                obs, info = env.reset()
                episode += 1
                total_reward, step = 0, 0
                print("------------------------------------------")
    
    env.close()

    Given the popularity of Q-Learning and Actor-Critic, there have been more recent hybrid versions combining the two approaches, which also extend DQN-style learning to continuous action spaces, for example Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3). But beware: the more complex the model, the harder the training.
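
    For reference, swapping one of these hybrids into the training script is essentially a one-line change with Stable-Baselines3 (a sketch reusing the same custom env and logs folder; "ant_td3" is just a hypothetical model name, and the hyperparameters would need tuning):

    # TRAIN (sketch): TD3 on the same continuous custom environment
    env = DummyVecEnv([lambda: gym.make("CustomAntEnv-v1")])
    model = sb.TD3(policy="MlpPolicy", env=env, verbose=0, learning_rate=0.001,
                   tensorboard_log="logs/")
    model.learn(total_timesteps=100_000, tb_log_name="ant_td3", log_interval=10)
    model.save("ant_td3")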

    Experimental Models

    Besides the main families (Q-Learning and Actor-Critic), you can find other models that are less used in practice but no less interesting. In particular, they can be powerful alternatives for tasks where rewards are sparse and hard to design. For example:

    • Evolutionary Algorithms evolve policies through mutation and selection instead of a gradient. Inspired by Darwinian evolution, they are robust but computationally heavy.
    • Imitation Learning skips exploration and trains agents to mimic expert demonstrations. It's based on the concept of "behavioral cloning", mixing supervised learning with RL ideas.

    For experimental purposes, let's try the first one with EvoTorch, an open-source toolkit for neuroevolution. I'm choosing it because it works well with PyTorch and Gym (pip install evotorch).

    The best-known Evolutionary Algorithm for RL is Policy Gradients with Parameter Exploration (PGPE). Essentially, it doesn't train one Neural Network directly; instead, it builds a probability distribution (a Gaussian) over all possible weights (μ = the average set of weights, σ = the exploration around the center). In every generation, PGPE samples from the weights population, starting with a random policy. Then the model adjusts the mean and variance based on the reward (the evolution of the population). PGPE is considered parallelized RL because, unlike classic methods like Q-Learning and Actor-Critic, which update one policy using batches of samples, PGPE evaluates many policy variations in parallel.

    Before running the training, we have to define the "problem", which is the task to optimize (basically our environment).

    from evotorch.neuroevolution import GymNE
    from evotorch.algorithms import PGPE
    from evotorch.logging import StdOutLogger
    
    ## problem
    train = GymNE(env=CustomAntEnv, #directly the class because it's a custom env
                  env_config={"render_mode":None}, #no rendering to speed up
                  network="Linear(obs_length, act_length)", #linear policy
                  observation_normalization=True,
                  decrease_rewards_by=1, #normalization trick to stabilize evolution
                  episode_length=200, #steps per episode
                  num_actors="max") #use all available CPU cores
    
    ## model
    model = PGPE(problem=train, popsize=20, stdev_init=0.1, #keep it small
                 center_learning_rate=0.005, stdev_learning_rate=0.1,
                 optimizer_config={"max_speed":0.015})
    
    ## train
    StdOutLogger(searcher=model, interval=20)
    model.run(num_generations=100)

    In order to test the model, we need another "problem" that renders the simulation. Then we just extract the best-performing set of weights from the center of the distribution (during training, the Gaussian shifted toward better regions of the policy space).

    ## visualization problem
    test = GymNE(env=CustomAntEnv, env_config={"render_mode":"human"},
                 network="Linear(obs_length, act_length)",
                 observation_normalization=True,
                 decrease_rewards_by=1,
                 num_actors=1) #only need 1 for visualization
    
    ## test best policy
    population_center = model.status["center"]
    policy = test.to_policy(population_center)
    
    ## render
    test.visualize(policy)

    Conclusion

    This article has been a tutorial on how to use Reinforcement Learning for Robotics. I showed how to build 3D simulations with Gym and MuJoCo, how to customize an environment, and which RL algorithms are better suited for different use cases. New tutorials with more advanced robots will come.

    Full code for this text: GitHub

    I hope you enjoyed it! Feel free to contact me with questions and feedback, or just to share your interesting projects.

    👉 Let’s Connect 👈

    (All images are by the author unless otherwise noted)


