StarRailCopilot/doc/future_plan.md
2020-06-07 02:44:02 +08:00

7.4 KiB

Future plan

Here list the future plan for Alas, sorted from easy to hard.

Alas has a lot of handful public functions and class methods, from object detection to map logics, don't handcraft them again.

  • Test chapter 6 when using 1 fleet

  • Add logic to chapter 2, 4, 5, when using 2 fleets

    Chapter 1 to 5 are written in logic of 1 fleet, and I added the logic of 2 fleets in chapter 3, try to imitate that.

  • Add more maps, from 7-3 to 13-4

    9-2, 9-3, 9-4, 10-1, 10-2, already done. Thanks Cerzo, Lanser, whoamikyo

    Read doc/development.md for details.

    Some new logics may not included, read module/map/map.py

  • Add support for hard map

    Alas can use the map file from normal mode, no need to rewrite

    Add a new option in module/config/argparser.py, under the "main chapter".

    Add the arg name into module/config/dictionary.py

    Add arg into config/template.ini

    Load it in module/config/config.py

    Pass that parameter to ensure_campaign_ui in module/campaign/campaign_ui.py line 74

  • Add support for dorm

    Add new file module/reward/dorm.py

    There should be options: DORM_REWARD_INTERVAL (1 hour, 2 hour, 3 hour, 4 hour), USE_GEM_FOOD (yes, no)

    Should able to pick up love and coin in dorm, may hard to pick all of them, not a big deal.

    Call OCR to detect food remain and total

    Don't use template matching to detect which food is available, because the naval curry change from time to time. The unavailable food items have a white overlay, make good use of that.

  • Adaptive parameters for perspective detection

    When alas get an perspective error, it stops. Catch the errors, auto adjust the parameters and retry.

    Read perspective_en.md, see how alas do map detection, and you need to have some basic knowledge of perspective.

  • Restart when get stuck or lost connection

    Don't simply use handle_popup_confirm(), because many module also use that.

    After the restart, retreat from map, because alas can't resume from a unfinished map.

    Alas have highly wrapped API, you may need to touch some low-level functions to do that.

    When you're done, test the speed, see if it effects on the performance.

  • Try ascreencap

    I notice that ALauto use ascreencap, test and see if it's faster than uiautomator2.

    If so, add that option in GUI, and set to default.

  • Update cnocr

    If we use the latest cnocr, we can recognize English letters better, get rid of mxnet==1.4.1, have smaller model size.

    But I already train 2 models under cnocr==1.0.0, named stage and digit, test if they are compatible.

  • Re-train ocr models

    The existing ocr model named stage and digit are poorly trained, you need to make sure letters are full-filled the image.

    The fat letters which is frequently used game is Impact

    The slim letters use in level and oil is AgencyFB-Regular

    The letters in research which looks cool is MStiffHeiHK-UltraBold

    It's better to use only one model to recognize all.

  • Add support research

    Do this after updating cnocr and retraining models.

    Imitating commission.py, use ocr to get research name, classify them, then choose

  • Re-fit the relationship between map swipe and screen swipe

    This function is _map_swipe in module/map/camera.py line 36. If you want to focus camera from D3 to E4, then it will do a swipe (-200, -140). A simple linear fit turn (1, 1) to be (-200, -140).

    I fit this in 7-2, but i found it no working well in some large maps. Alas have to call focus_to_grid_center in line 89 to fix camera position. But sometimes that failed, alas update map data to the wrong grids, and cause a series of error until ensure_edge_insight is called.

    I do know that the relationship between map swipe and screen swipe is complex, but linear would be enough, because we won't swipe too far.

    How close is the camera on the map will also effect the relationship, you can use the mid of vertical lines the measure that. So closer camera means larger diff in vertical.mid, and further swipe, try to fit the the exact relationship.

  • Combine perspective.py and homg_trans_beta brunch in ALauto

    asd111333 shared his thoughts on map detection to me. His project is doing so good when there is only few empty grids on screen which is the weakness of Alas, and ALauto don't have a global map view, which is the strengthen of Alas.

    perspective.py can generate the camera setting needed in his approach, so no longer need to hard-code camera settings for different maps.

    Alas work in 720p and ALauto works in 1080p, so there is a 1.5x scaling.

def get_homo_data(self, tile_width=209/1.5, tile_height=209/1.5):
    hori = self.horizontal[0].add(self.horizontal[-1])
    vert = self.vertical[0].add(self.vertical[-1])
    src_pts = hori.cross(vert).points

    x = len(self.vertical) - 1
    y = len(self.horizontal) - 1

    dst_pts = np.array([[0, 0], [x * tile_width, 0], [0, y * tile_height], [x * tile_width, y * tile_height]], dtype=float)
    dst_pts += src_pts[0]

    return [src_pts * 1.5, dst_pts * 1.5]

I replaced the assets in 720p, now can directly generate the parameter for cv2.warpPerspective

def get_perspective(self):
    tile_width = 140
    tile_height = 140
    # Generate homo from src_pts and dst_pts
    perspective = Perspective(image, config=self.config)
    hori = perspective.horizontal[0].add(perspective.horizontal[-1])
    vert = perspective.vertical[0].add(perspective.vertical[-1])
    src_pts = hori.cross(vert).points
    x = len(perspective.vertical) - 1
    y = len(perspective.horizontal) - 1
    dst_pts = np.array([[0, 0], [x * tile_width, 0], [0, y * tile_height], [x * tile_width, y * tile_height]],
                       dtype=float)
    dst_pts += src_pts[0]
    homo, _ = cv2.findHomography(src_pts, dst_pts)

    # Re-generate to align image
    area = self.config.DETECTING_AREA
    area = np.array([[area[0], area[1]], [area[2], area[1]], [area[0], area[3]], [area[2], area[3]]])
    matrix = homo.dot(np.pad(area, ((0, 0), (0, 1)), mode='constant', constant_values=1).T)
    x, y = matrix[0] / matrix[2], matrix[1] / matrix[2]
    x, y = x - np.min(x), y - np.min(y)
    homo, _ = cv2.findHomography(area, np.array([x, y]).T)

    # Perspective transform
    size = (np.ceil(np.max(x)).astype(int), np.ceil(np.max(y)).astype(int))
    screen_trans = cv2.warpPerspective(np.array(perspective.image), homo, size)
    screen_edge = cv2.Canny(screen_trans, 50, 100)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    screen_edge = cv2.morphologyEx(screen_edge, cv2.MORPH_CLOSE, kernel)

    # Find an empty grid
    res = cv2.matchTemplate(screen_edge, free_tile_center, cv2.TM_CCOEFF_NORMED)
    _, similarity, _, location = cv2.minMaxLoc(res)
    logger.info(f'  free_tile_center_sim: {str(round(similarity, 3)).ljust(5, "0")}')

I imagined a way to use two approach. After entering map, call perspective.py to generate camera data, and use homo_trans till the end. If get any error, fallback to perspective.

This will be faster and with lower error rate, and get rid of the parameters of scipy.signal.find_peaks

I got a problem now. Alas use the edge of the map to locate camera. If edge is not insight, locate camera from history swipe which may be unstable. When edges appear again, use that to correct camera location. So even if i use homo_trans, i still need to call perspective from time to time.